[Solved] Generate new attributes with variable other attributes

qwertz Member Posts: 130  Maven
edited November 2018 in Help

Dear all,

I have another tricky question. In the attached process I run the "Loop Parameters" operator several times. On each iteration, for attX (X = 1, 2, ...), I would like to create a new attribute that holds the result of a calculation.

The calculation, applied to every example, should use the current attX together with the prediction made for attX in that iteration.
(In the attached process I used a multiplication as an example.)


Any ideas for a good approach?


All the best
Sachs


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
   <process expanded="true" height="407" width="701">
     <operator activated="true" class="generate_data" compatibility="5.2.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
     <operator activated="true" class="multiply" compatibility="5.2.008" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
     <operator activated="true" class="loop_parameters" compatibility="5.2.008" expanded="true" height="94" name="Loop Parameters" width="90" x="313" y="30">
       <list key="parameters">
         <parameter key="Windowing.label_attribute" value="att1,att2"/>
       </list>
       <process expanded="true" height="431" width="634">
         <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing (2)" width="90" x="45" y="165">
           <parameter key="window_size" value="1"/>
         </operator>
         <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="45" y="30">
           <parameter key="horizon" value="1"/>
           <parameter key="window_size" value="1"/>
           <parameter key="create_label" value="true"/>
           <parameter key="label_attribute" value="att2"/>
         </operator>
         <operator activated="true" class="series:sliding_window_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="179" y="30">
           <parameter key="training_window_width" value="20"/>
           <parameter key="training_window_step_size" value="10"/>
           <parameter key="test_window_width" value="20"/>
           <parameter key="horizon" value="5"/>
           <process expanded="true" height="410" width="292">
             <operator activated="true" class="neural_net" compatibility="5.2.008" expanded="true" height="76" name="Neural Net" width="90" x="105" y="30">
               <list key="hidden_layers"/>
               <parameter key="training_cycles" value="5"/>
             </operator>
             <connect from_port="training" to_op="Neural Net" to_port="training set"/>
             <connect from_op="Neural Net" from_port="model" to_port="model"/>
             <portSpacing port="source_training" spacing="0"/>
             <portSpacing port="sink_model" spacing="0"/>
             <portSpacing port="sink_through 1" spacing="0"/>
           </process>
           <process expanded="true" height="410" width="292">
             <operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
               <list key="application_parameters"/>
             </operator>
             <operator activated="true" class="series:forecasting_performance" compatibility="5.1.002" expanded="true" height="76" name="Performance" width="90" x="173" y="30">
               <parameter key="horizon" value="1"/>
             </operator>
             <connect from_port="model" to_op="Apply Model" to_port="model"/>
             <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
             <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
             <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
             <portSpacing port="source_model" spacing="0"/>
             <portSpacing port="source_test set" spacing="0"/>
             <portSpacing port="source_through 1" spacing="0"/>
             <portSpacing port="sink_averagable 1" spacing="0"/>
             <portSpacing port="sink_averagable 2" spacing="0"/>
           </process>
         </operator>
         <operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model (2)" width="90" x="313" y="120">
           <list key="application_parameters"/>
         </operator>
         <operator activated="true" class="generate_attributes" compatibility="5.2.008" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="120">
           <list key="function_descriptions">
             <parameter key="Tag" value="param(&quot;Windowing&quot;, &quot;label_attribute&quot;)"/>
             <parameter key="New" value="VALUE(prediction)*VALUE(label_attribute)"/>
           </list>
         </operator>
         <connect from_port="input 1" to_op="Windowing" to_port="example set input"/>
         <connect from_port="input 2" to_op="Windowing (2)" to_port="example set input"/>
         <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
         <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
         <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
         <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Generate Attributes" to_port="example set input"/>
         <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="source_input 2" spacing="0"/>
         <portSpacing port="source_input 3" spacing="0"/>
         <portSpacing port="sink_performance" spacing="72"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
     <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
     <connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
     <connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters" to_port="input 2"/>
     <connect from_op="Loop Parameters" from_port="result 1" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869   Unicorn
    Hi Sachs,

    Building on your approach, this becomes much easier if you loop over a macro value instead of a parameter of Windowing.
    In Generate Attributes, an attribute can only be used in an expression if its name contains nothing but alphanumeric characters and the underscore. You therefore have to rename the attribute "prediction(label)" to something like "prediction". After that, it is just a matter of using the macro correctly in Generate Attributes to get the desired result. Please see the attached process for details.

    Best,
      ~Marius
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.009">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.009" expanded="true" name="Process">
        <process expanded="true" height="493" width="433">
          <operator activated="true" class="generate_data" compatibility="5.2.009" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
          <operator activated="true" class="multiply" compatibility="5.2.009" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
          <operator activated="true" class="loop_parameters" compatibility="5.2.009" expanded="true" height="94" name="Loop Parameters" width="90" x="313" y="30">
            <list key="parameters">
              <parameter key="Set Macro.value" value="att1,att2"/>
            </list>
            <process expanded="true" height="493" width="1016">
              <operator activated="true" class="set_macro" compatibility="5.2.009" expanded="true" height="76" name="Set Macro" width="90" x="45" y="30">
                <parameter key="macro" value="label_attribute"/>
                <parameter key="value" value="att2"/>
              </operator>
              <operator activated="true" class="series:windowing" compatibility="5.2.000" expanded="true" height="76" name="Windowing (2)" width="90" x="246" y="165">
                <parameter key="window_size" value="1"/>
                <parameter key="create_label" value="true"/>
                <parameter key="label_attribute" value="%{label_attribute}"/>
              </operator>
              <operator activated="true" class="series:windowing" compatibility="5.2.000" expanded="true" height="76" name="Windowing" width="90" x="246" y="30">
                <parameter key="horizon" value="1"/>
                <parameter key="window_size" value="1"/>
                <parameter key="create_label" value="true"/>
                <parameter key="label_attribute" value="%{label_attribute}"/>
              </operator>
              <operator activated="true" class="series:sliding_window_validation" compatibility="5.2.000" expanded="true" height="112" name="Validation" width="90" x="380" y="30">
                <parameter key="training_window_width" value="20"/>
                <parameter key="training_window_step_size" value="10"/>
                <parameter key="test_window_width" value="20"/>
                <parameter key="horizon" value="5"/>
                <process expanded="true" height="410" width="292">
                  <operator activated="true" class="neural_net" compatibility="5.2.009" expanded="true" height="76" name="Neural Net" width="90" x="105" y="30">
                    <list key="hidden_layers"/>
                    <parameter key="training_cycles" value="5"/>
                  </operator>
                  <connect from_port="training" to_op="Neural Net" to_port="training set"/>
                  <connect from_op="Neural Net" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true" height="410" width="292">
                  <operator activated="true" class="apply_model" compatibility="5.2.009" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="series:forecasting_performance" compatibility="5.2.000" expanded="true" height="76" name="Performance" width="90" x="173" y="30">
                    <parameter key="horizon" value="1"/>
                  </operator>
                  <connect from_port="model" to_op="Apply Model" to_port="model"/>
                  <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                  <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                  <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" breakpoints="after" class="apply_model" compatibility="5.2.009" expanded="true" height="76" name="Apply Model (2)" width="90" x="514" y="120">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="rename" compatibility="5.2.009" expanded="true" height="76" name="Rename" width="90" x="648" y="120">
                <parameter key="old_name" value="prediction(label)"/>
                <parameter key="new_name" value="prediction"/>
                <list key="rename_additional_attributes"/>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="5.2.009" expanded="true" height="76" name="Generate Attributes" width="90" x="782" y="120">
                <list key="function_descriptions">
                  <parameter key="Tag_%{label_attribute}" value="&quot;%{label_attribute}&quot;"/>
                  <parameter key="New_%{label_attribute}" value="prediction*label"/>
                </list>
              </operator>
              <connect from_port="input 1" to_op="Set Macro" to_port="through 1"/>
              <connect from_port="input 2" to_op="Windowing (2)" to_port="example set input"/>
              <connect from_op="Set Macro" from_port="through 1" to_op="Windowing" to_port="example set input"/>
              <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
              <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Rename" to_port="example set input"/>
              <connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
              <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="126"/>
              <portSpacing port="source_input 3" spacing="0"/>
              <portSpacing port="sink_performance" spacing="72"/>
              <portSpacing port="sink_result 1" spacing="0"/>
              <portSpacing port="sink_result 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Loop Parameters" to_port="input 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Loop Parameters" to_port="input 2"/>
          <connect from_op="Loop Parameters" from_port="result 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
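
For readers who want to see what the macro does in each iteration, here is a minimal Python sketch of the substitution. It assumes that `%{label_attribute}` is replaced as plain text in both the attribute name and the expression before Generate Attributes runs. The helpers `expand_macros` and `generate_attribute_entries` are hypothetical names for illustration only and are not part of RapidMiner.

```python
import re


def expand_macros(text, macros):
    """Replace every %{name} in text with macros[name] (plain text substitution)."""
    def lookup(match):
        return macros[match.group(1)]
    return re.sub(r"%\{([^}]+)\}", lookup, text)


def generate_attribute_entries(label_attribute):
    """Expand the two Generate Attributes entries from the process for one loop value."""
    macros = {"label_attribute": label_attribute}
    # Keys and values copied from the function_descriptions list in the process above.
    entries = {
        "Tag_%{label_attribute}": '"%{label_attribute}"',
        "New_%{label_attribute}": "prediction*label",
    }
    return {
        expand_macros(name, macros): expand_macros(expression, macros)
        for name, expression in entries.items()
    }


# Loop Parameters iterates "Set Macro.value" over att1 and att2.
for value in ["att1", "att2"]:
    print(generate_attribute_entries(value))
```

Each iteration therefore produces differently named attributes (for example `New_att1` and `New_att2`), while the expression `prediction*label` stays the same because Rename has already given the prediction a name that is valid in an expression.
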
  • qwertz Member Posts: 130  Maven


    Thanks man! Without your help I would probably never have done it!

    What I find remarkable is that you also set the "create label" parameter in the second "Windowing" operator, even though "Apply Model" expects unlabelled data. I was not aware that this works at all, or that it works without influencing the model.


    You really made my day! Thanks again
    Sachs
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869   Unicorn
    The same thing happens in X-Validation. Apply Model does not use the label at all, but you can use it in later operators. The Performance operator, for example, could not do anything without knowing the true label in addition to the prediction.

    Best, Marius
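
The point about the label can be illustrated with a small Python sketch. It is only an analogy under simple assumptions: `apply_model`, `mean_absolute_error`, and `toy_model` are hypothetical stand-ins, not RapidMiner operators or APIs. Scoring reads only the regular attributes, so a label that is present changes nothing, whereas a performance measure needs both the label and the prediction.

```python
def apply_model(model, examples):
    """Add a 'prediction' to each example using only the regular attributes."""
    scored = []
    for example in examples:
        row = dict(example)
        row["prediction"] = model(row["features"])  # the label is never read here
        scored.append(row)
    return scored


def mean_absolute_error(scored):
    """A performance measure: it needs the true label as well as the prediction."""
    return sum(abs(row["label"] - row["prediction"]) for row in scored) / len(scored)


def toy_model(features):
    """Stand-in for a trained model: doubles the first attribute."""
    return 2.0 * features[0]


labelled = [
    {"features": [1.0], "label": 2.5},
    {"features": [2.0], "label": 3.0},
]
unlabelled = [{"features": example["features"]} for example in labelled]

scored_labelled = apply_model(toy_model, labelled)
scored_unlabelled = apply_model(toy_model, unlabelled)

# Same predictions with or without a label on the input.
print([row["prediction"] for row in scored_labelled])
print([row["prediction"] for row in scored_unlabelled])

# Only the labelled result can be evaluated, or combined as prediction*label.
print(mean_absolute_error(scored_labelled))
print([row["prediction"] * row["label"] for row in scored_labelled])
```
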