[Solved] A logical challenge

qwertz Member Posts: 130 Contributor II
edited November 2018 in Help
Dear all,

Today I have a question that I assume can be solved with RapidMiner, but I can't work out the logic...


I have this kind of table structure:

row  predic  date  v1    v2   tag
1    6211    2006  5861  87   v1
2    6215    2007  6010  91   v1
3    105     2006  5845  100  v2
4    98      2007  5495  88   v2


And I am looking for a process that does the following calculation for all rows:
predic = (predic - attribute column named in tag) / attribute column named in tag

Example with numbers:
for row 1: predic new = (6211 - 5861) / 5861
for row 4: predic new = (98 - 88) / 88


The main challenge for me is how to retrieve the value of an attribute whose name is given by a tag in another attribute.
I need the calculated result in the same attribute ("predic") because I want to pivot the data afterwards.
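
To make the intended calculation concrete outside RapidMiner, here is a minimal Python sketch. It assumes the table above is held as a list of dictionaries; the names `rows` and `relative_deviation` are made up for illustration and are not part of any RapidMiner process.

```python
# Hypothetical in-memory copy of the example table from this post.
rows = [
    {"row": 1, "predic": 6211, "date": 2006, "v1": 5861, "v2": 87, "tag": "v1"},
    {"row": 2, "predic": 6215, "date": 2007, "v1": 6010, "v2": 91, "tag": "v1"},
    {"row": 3, "predic": 105, "date": 2006, "v1": 5845, "v2": 100, "tag": "v2"},
    {"row": 4, "predic": 98, "date": 2007, "v1": 5495, "v2": 88, "tag": "v2"},
]


def relative_deviation(row):
    """Return (predic - x) / x, where x is the column named in row['tag']."""
    reference = row[row["tag"]]  # the tag value is used as a column name
    return (row["predic"] - reference) / reference


# Write the result back into "predic", as asked for in the post.
updated = [{**r, "predic": relative_deviation(r)} for r in rows]

for r in updated:
    print(r["row"], r["tag"], r["predic"])
```

The key step is the double lookup `row[row["tag"]]`: the inner lookup yields a column name, the outer one yields the number stored under that name in the same row.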


Cheers
Sachs

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
   <process expanded="true" height="672" width="748">
     <operator activated="true" class="generate_data" compatibility="5.2.003" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
     <operator activated="true" class="select_attributes" compatibility="5.2.003" expanded="true" height="76" name="Select Attributes (2)" width="90" x="179" y="30">
       <parameter key="attribute_filter_type" value="single"/>
       <parameter key="attribute" value="label"/>
       <parameter key="invert_selection" value="true"/>
       <parameter key="include_special_attributes" value="true"/>
     </operator>
     <operator activated="true" class="multiply" compatibility="5.2.003" expanded="true" height="94" name="Multiply" width="90" x="313" y="30"/>
     <operator activated="true" class="loop_parameters" compatibility="5.2.003" expanded="true" height="94" name="Loop Parameters" width="90" x="447" y="30">
       <list key="parameters">
         <parameter key="Windowing.label_attribute" value="att1,att2"/>
       </list>
       <process expanded="true" height="416" width="748">
         <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing (2)" width="90" x="45" y="165">
           <parameter key="window_size" value="1"/>
         </operator>
         <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="45" y="30">
           <parameter key="horizon" value="1"/>
           <parameter key="window_size" value="1"/>
           <parameter key="create_label" value="true"/>
           <parameter key="label_attribute" value="att2"/>
         </operator>
         <operator activated="true" class="series:sliding_window_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="179" y="30">
           <parameter key="training_window_width" value="20"/>
           <parameter key="training_window_step_size" value="5"/>
           <parameter key="test_window_width" value="20"/>
           <parameter key="horizon" value="5"/>
           <process expanded="true" height="371" width="321">
             <operator activated="true" class="support_vector_machine" compatibility="5.2.003" expanded="true" height="112" name="SVM" width="90" x="115" y="30"/>
             <connect from_port="training" to_op="SVM" to_port="training set"/>
             <connect from_op="SVM" from_port="model" to_port="model"/>
             <portSpacing port="source_training" spacing="0"/>
             <portSpacing port="sink_model" spacing="0"/>
             <portSpacing port="sink_through 1" spacing="0"/>
           </process>
           <process expanded="true" height="371" width="321">
             <operator activated="true" class="apply_model" compatibility="5.2.003" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
               <list key="application_parameters"/>
             </operator>
             <operator activated="true" class="series:forecasting_performance" compatibility="5.1.002" expanded="true" height="76" name="Performance" width="90" x="183" y="30">
               <parameter key="horizon" value="1"/>
             </operator>
             <connect from_port="model" to_op="Apply Model" to_port="model"/>
             <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
             <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
             <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
             <portSpacing port="source_model" spacing="0"/>
             <portSpacing port="source_test set" spacing="0"/>
             <portSpacing port="source_through 1" spacing="0"/>
             <portSpacing port="sink_averagable 1" spacing="0"/>
             <portSpacing port="sink_averagable 2" spacing="0"/>
           </process>
         </operator>
         <operator activated="true" class="apply_model" compatibility="5.2.003" expanded="true" height="76" name="Apply Model (2)" width="90" x="313" y="120">
           <list key="application_parameters"/>
         </operator>
         <operator activated="true" class="generate_attributes" compatibility="5.2.003" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="120">
           <list key="function_descriptions">
             <parameter key="Tag" value="param(&quot;Windowing&quot;, &quot;label_attribute&quot;)"/>
           </list>
         </operator>
         <connect from_port="input 1" to_op="Windowing" to_port="example set input"/>
         <connect from_port="input 2" to_op="Windowing (2)" to_port="example set input"/>
         <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
         <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
         <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
         <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Generate Attributes" to_port="example set input"/>
         <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="source_input 2" spacing="0"/>
         <portSpacing port="source_input 3" spacing="0"/>
         <portSpacing port="sink_performance" spacing="72"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
     <operator activated="true" class="append" compatibility="5.2.003" expanded="true" height="76" name="Append" width="90" x="581" y="30"/>
     <connect from_op="Generate Data" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
     <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Multiply" to_port="input"/>
     <connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
     <connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters" to_port="input 2"/>
     <connect from_op="Loop Parameters" from_port="result 1" to_op="Append" to_port="example set 1"/>
     <connect from_op="Append" from_port="merged set" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>

Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello


    Something like this?...
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
        <process expanded="true" height="672" width="882">
          <operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
          <operator activated="true" class="select_attributes" compatibility="5.2.006" expanded="true" height="76" name="Select Attributes (2)" width="90" x="45" y="120">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.2.006" expanded="true" height="112" name="Multiply" width="90" x="45" y="210"/>
          <operator activated="true" class="loop_parameters" compatibility="5.2.006" expanded="true" height="94" name="Loop Parameters" width="90" x="179" y="210">
            <list key="parameters">
              <parameter key="Windowing.label_attribute" value="att1,att2"/>
            </list>
            <process expanded="true" height="416" width="748">
              <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing (2)" width="90" x="45" y="165">
                <parameter key="window_size" value="1"/>
              </operator>
              <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="45" y="30">
                <parameter key="horizon" value="1"/>
                <parameter key="window_size" value="1"/>
                <parameter key="create_label" value="true"/>
                <parameter key="label_attribute" value="att2"/>
              </operator>
              <operator activated="true" class="series:sliding_window_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="179" y="30">
                <parameter key="training_window_width" value="20"/>
                <parameter key="training_window_step_size" value="5"/>
                <parameter key="test_window_width" value="20"/>
                <parameter key="horizon" value="5"/>
                <process expanded="true" height="371" width="321">
                  <operator activated="true" class="support_vector_machine" compatibility="5.2.006" expanded="true" height="112" name="SVM" width="90" x="115" y="30"/>
                  <connect from_port="training" to_op="SVM" to_port="training set"/>
                  <connect from_op="SVM" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true" height="371" width="321">
                  <operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="series:forecasting_performance" compatibility="5.1.002" expanded="true" height="76" name="Performance" width="90" x="183" y="30">
                    <parameter key="horizon" value="1"/>
                  </operator>
                  <connect from_port="model" to_op="Apply Model" to_port="model"/>
                  <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                  <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                  <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model (2)" width="90" x="313" y="120">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="5.2.006" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="120">
                <list key="function_descriptions">
                  <parameter key="Tag" value="param(&quot;Windowing&quot;, &quot;label_attribute&quot;)"/>
                </list>
              </operator>
              <connect from_port="input 1" to_op="Windowing" to_port="example set input"/>
              <connect from_port="input 2" to_op="Windowing (2)" to_port="example set input"/>
              <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
              <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Generate Attributes" to_port="example set input"/>
              <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="source_input 3" spacing="0"/>
              <portSpacing port="sink_performance" spacing="72"/>
              <portSpacing port="sink_result 1" spacing="0"/>
              <portSpacing port="sink_result 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="5.2.006" expanded="true" height="76" name="Append" width="90" x="179" y="345"/>
          <operator activated="true" class="rename_by_replacing" compatibility="5.2.006" expanded="true" height="76" name="Rename by Replacing" width="90" x="179" y="435">
            <parameter key="replace_what" value="^(.*)-.*$"/>
            <parameter key="replace_by" value="$1"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.2.006" expanded="true" height="94" name="Loop Examples" width="90" x="313" y="390">
            <process expanded="true" height="809" width="1091">
              <operator activated="true" class="filter_example_range" compatibility="5.2.006" expanded="true" height="76" name="Filter Example Range" width="90" x="167" y="79">
                <parameter key="first_example" value="%{example}"/>
                <parameter key="last_example" value="%{example}"/>
              </operator>
              <operator activated="true" class="extract_macro" compatibility="5.2.006" expanded="true" height="60" name="Extract Macro" width="90" x="179" y="165">
                <parameter key="macro" value="tagValue"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="Tag"/>
                <parameter key="example_index" value="1"/>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="5.2.006" expanded="true" height="76" name="Generate Attributes (2)" width="90" x="447" y="165">
                <list key="function_descriptions">
                  <parameter key="newAtt" value="%{tagValue}-att3"/>
                </list>
              </operator>
              <connect from_port="example set" to_op="Filter Example Range" to_port="example set input"/>
              <connect from_op="Filter Example Range" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
              <connect from_op="Filter Example Range" from_port="original" to_port="example set"/>
              <connect from_op="Extract Macro" from_port="example set" to_op="Generate Attributes (2)" to_port="example set input"/>
              <connect from_op="Generate Attributes (2)" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="5.2.006" expanded="true" height="76" name="Append (2)" width="90" x="447" y="390"/>
          <connect from_op="Generate Data" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
          <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters" to_port="input 2"/>
          <connect from_op="Loop Parameters" from_port="result 1" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_op="Rename by Replacing" to_port="example set input"/>
          <connect from_op="Rename by Replacing" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="output 1" to_op="Append (2)" to_port="example set 1"/>
          <connect from_op="Append (2)" from_port="merged set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    regards

    Andrew
  • qwertz Member Posts: 130 Contributor II

    Hi!

    It works! That's it!

    I modified the code a little according to my formula, but it's your code that made this process fly.

    Though, I have to admit that I don't yet fully understand what it does...
    Anyway... Something left to discover for tomorrow.


    Thanks again!

    Bye
    Sachs


    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
        <process expanded="true" height="672" width="882">
          <operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="25"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.2.006" expanded="true" height="76" name="Select Attributes (2)" width="90" x="179" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.2.006" expanded="true" height="94" name="Multiply" width="90" x="313" y="30"/>
          <operator activated="true" class="loop_parameters" compatibility="5.2.006" expanded="true" height="94" name="Loop Parameters" width="90" x="447" y="30">
            <list key="parameters">
              <parameter key="Windowing.label_attribute" value="att1,att2"/>
            </list>
            <process expanded="true" height="416" width="748">
              <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing (2)" width="90" x="45" y="165">
                <parameter key="window_size" value="1"/>
              </operator>
              <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="45" y="30">
                <parameter key="horizon" value="1"/>
                <parameter key="window_size" value="1"/>
                <parameter key="create_label" value="true"/>
                <parameter key="label_attribute" value="att2"/>
              </operator>
              <operator activated="true" class="series:sliding_window_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="179" y="30">
                <parameter key="training_window_width" value="4"/>
                <parameter key="training_window_step_size" value="2"/>
                <parameter key="test_window_width" value="4"/>
                <parameter key="horizon" value="2"/>
                <process expanded="true" height="371" width="321">
                  <operator activated="true" class="support_vector_machine" compatibility="5.2.006" expanded="true" height="112" name="SVM" width="90" x="115" y="30"/>
                  <connect from_port="training" to_op="SVM" to_port="training set"/>
                  <connect from_op="SVM" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true" height="371" width="321">
                  <operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="series:forecasting_performance" compatibility="5.1.002" expanded="true" height="76" name="Performance" width="90" x="183" y="30">
                    <parameter key="horizon" value="1"/>
                  </operator>
                  <connect from_port="model" to_op="Apply Model" to_port="model"/>
                  <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                  <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                  <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model (2)" width="90" x="313" y="120">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="5.2.006" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="120">
                <list key="function_descriptions">
                  <parameter key="Tag" value="param(&quot;Windowing&quot;, &quot;label_attribute&quot;)"/>
                </list>
              </operator>
              <connect from_port="input 1" to_op="Windowing" to_port="example set input"/>
              <connect from_port="input 2" to_op="Windowing (2)" to_port="example set input"/>
              <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
              <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Generate Attributes" to_port="example set input"/>
              <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="source_input 3" spacing="0"/>
              <portSpacing port="sink_performance" spacing="72"/>
              <portSpacing port="sink_result 1" spacing="0"/>
              <portSpacing port="sink_result 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="5.2.006" expanded="true" height="76" name="Append" width="90" x="581" y="30"/>
          <operator activated="true" class="rename_by_replacing" compatibility="5.2.006" expanded="true" height="76" name="Rename by Replacing" width="90" x="45" y="165">
            <parameter key="replace_what" value="^(.*)-.*$"/>
            <parameter key="replace_by" value="$1"/>
          </operator>
          <operator activated="true" class="rename" compatibility="5.2.006" expanded="true" height="76" name="Rename" width="90" x="179" y="165">
            <parameter key="old_name" value="prediction(label)"/>
            <parameter key="new_name" value="prediction"/>
            <list key="rename_additional_attributes"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.2.006" expanded="true" height="94" name="Loop Examples" width="90" x="313" y="165">
            <process expanded="true" height="390" width="634">
              <operator activated="true" class="filter_example_range" compatibility="5.2.006" expanded="true" height="76" name="Filter Example Range" width="90" x="45" y="30">
                <parameter key="first_example" value="%{example}"/>
                <parameter key="last_example" value="%{example}"/>
              </operator>
              <operator activated="true" class="extract_macro" compatibility="5.2.006" expanded="true" height="60" name="Extract Macro" width="90" x="179" y="30">
                <parameter key="macro" value="tagValue"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="Tag"/>
                <parameter key="example_index" value="1"/>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="5.2.006" expanded="true" height="76" name="Generate Attributes (2)" width="90" x="313" y="30">
                <list key="function_descriptions">
                  <parameter key="newAtt" value="(prediction-%{tagValue})/prediction*100"/>
                </list>
              </operator>
              <connect from_port="example set" to_op="Filter Example Range" to_port="example set input"/>
              <connect from_op="Filter Example Range" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
              <connect from_op="Extract Macro" from_port="example set" to_op="Generate Attributes (2)" to_port="example set input"/>
              <connect from_op="Generate Attributes (2)" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="5.2.006" expanded="true" height="76" name="Append (2)" width="90" x="447" y="165"/>
          <connect from_op="Generate Data" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
          <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters" to_port="input 2"/>
          <connect from_op="Loop Parameters" from_port="result 1" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_op="Rename by Replacing" to_port="example set input"/>
          <connect from_op="Rename by Replacing" from_port="example set output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="output 1" to_op="Append (2)" to_port="example set 1"/>
          <connect from_op="Append (2)" from_port="merged set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="126"/>
          <portSpacing port="sink_result 2" spacing="18"/>
        </process>
      </operator>
    </process>
  • qwertz Member Posts: 130 Contributor II

    When trying to understand the process I came across the "Rename by Replacing" operator.

    replace_what parameter is: ^(.*)-.*$
    replace_by parameter is: $1

    I found a link to this article here in the Forum on regular expressions:
    http://docs.oracle.com/javase/1.5.0/docs/api/java/util/regex/Pattern.html
    Furthermore, in RapidMiner Tutorial 4.6 (not in the current one) the same chapter on regular expressions can be found.

    However, I don't fully understand what it does:
    replace_what also works without the boundaries ^ and $. Why are they necessary then?
    Why is the first .* grouped by brackets?
    What does $1 stand for?
    What would be $2 and $3 mentioned in the operator description?


    *** puzzled ***


    Regards
    Sachs
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello

    It's a regular expression that says something like...

    Start at the beginning of the line then match any number of arbitrary characters, a "-" and then any number of arbitrary characters until the end of the line. The brackets are a capturing group so all the arbitrary characters before the "-" are placed in it and this is referred to as $1 when the replacement is done. This has the effect of stripping out everything from the "-" inclusive to the end of the line for the attribute name.

    The beginning-of-line and end-of-line anchors are probably not needed here. That is often the way with regular expressions: there are many ways of solving a specific problem.
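
For anyone who wants to try the pattern outside RapidMiner, here is a small Python sketch. Two things are assumptions: the attribute names in `names` are invented examples of the form "name-suffix", and Python writes the first capturing group in the replacement as `\1`, where the Java syntax linked above uses `$1`.

```python
import re

# Hypothetical attribute names; only the ones containing "-" should change.
names = ["att1-0", "att2-0", "att3"]

anchored = re.compile(r"^(.*)-.*$")  # pattern from the process
unanchored = re.compile(r"(.*)-.*")  # same pattern without ^ and $


def strip_suffix(pattern, name):
    # \1 is Python's spelling of the first capturing group ($1 in Java).
    return pattern.sub(r"\1", name)


for name in names:
    print(name, "->", strip_suffix(anchored, name), "|", strip_suffix(unanchored, name))

# With two groups, \2 (Java: $2) refers to the second pair of brackets.
swapped = re.sub(r"^(.*)-(.*)$", r"\2_\1", "att1-0")
print(swapped)
```

In this sketch both patterns give the same result because `.*` is greedy and already stretches to both ends of the name, which is why the anchors make no difference here. A name without "-" does not match and is left unchanged.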

    regards

    Andrew
  • qwertz Member Posts: 130 Contributor II

    Ahhh, I wasn't aware that $1 refers to the term in brackets. Now it's much clearer.



    Just for information: I was wondering about the behaviour of the "Extract Macro" operator.

    Use case 1: I put the name of an arbitrary attribute in there (e.g. v1) and the operator returns the value of that attribute for the example (e.g. 5861).

    row  predic  date  v1    v2   tag
    1    6211    2006  5861  87   v1
    2    6215    2007  6010  91   v1
    3    105     2006  5845  100  v2
    4    98      2007  5495  88   v2


    Use case 2 (see code in this thread): I put in the name of an attribute (e.g. tag) whose values are the names of other attributes. In that case the value of the named attribute ends up being used in the calculation (e.g. again 5861 instead of v1).

    row  predic  date  v1    v2   tag
    1    6211    2006  5861  87   v1
    2    6215    2007  6010  91   v1
    3    105     2006  5845  100  v2
    4    98      2007  5495  88   v2


    I was also surprised that the parameter "attribute name" was just "Tag" and not %{Tag}.
    ...a little confusing for beginners, but still a handy feature...
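
One possible way to picture use case 2, sketched in Python rather than RapidMiner: the macro holds the text found in the "Tag" column, and that text is pasted into the expression before the expression is evaluated. The dictionary `example` and the use of `eval` are assumptions made purely for illustration; this is not how RapidMiner is implemented internally.

```python
# Hypothetical single example (row); names mirror the process in this thread.
example = {"prediction": 6211.0, "att1": 5861.0, "att2": 87.0, "Tag": "att1"}

# Step 1 (like Extract Macro): store the text from column "Tag" in a macro.
macros = {"tagValue": example["Tag"]}

# Step 2: replace %{tagValue} as plain text inside the expression.
template = "(prediction-%{tagValue})/prediction*100"
expression = template
for name, value in macros.items():
    expression = expression.replace("%{" + name + "}", value)

# Step 3: evaluate the resulting text with the numeric columns as variables.
# eval is used only to keep the illustration short.
variables = {k: v for k, v in example.items() if isinstance(v, float)}
result = eval(expression, {"__builtins__": {}}, variables)

print(expression)
print(result)
```

Under this reading, "Tag" is written without %{...} because it is a literal attribute name, while %{tagValue} asks for the stored macro text to be inserted.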



    Bye
    Sachs