"Loop output"

Xodarap Member Posts: 6 Contributor II
edited June 2019 in Help
I have a Loop Labels operator, which works fine. For each label, I run a regression and calculate the performance vector. The trouble I'm having is that I want the output to be a table like:

label name | regression info | performance vector

I have two questions:
1. How can I figure out which attribute was the label in a given iteration of the loop? I tried using a "Select Attribute" to find the one called "label", but this didn't seem to work.
2. Supposing I can find out the label, how can I flatten the results into a table structure?

I wasn't able to find anything in the documentation about loops.

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    unfortunately, the Loop Labels operator lacks some of the functionality you would need for your task. It can easily be replaced with a Loop Attributes operator, though, which gives you the label name as a macro. You only have to set the label role manually inside the loop operator.

    You can then use the Log operator to write everything to a table. With the Log to Data operator, you can convert that log into an example set again if you need one.
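
    As a rough sketch of that last step, a Log to Data operator could be placed in the top-level process after the loop. The operator name "Log" and the port names used here are assumptions and may differ in your version, so treat this as a fragment to adapt rather than a finished process:

    ```xml
    <!-- Sketch only: goes in the top-level process, after Loop Attributes has finished. -->
    <operator activated="true" class="log_to_data" expanded="true" name="Log to Data">
      <!-- Assumption: the Log operator inside the loop is named "Log". -->
      <parameter key="log_name" value="Log"/>
    </operator>
    <!-- Assumption: port names as shown; check them in the process view. -->
    <connect from_op="Loop Attributes" from_port="example set" to_op="Log to Data" to_port="through 1"/>
    <connect from_op="Log to Data" from_port="exampleSet" to_port="result 1"/>
    ```

    The resulting example set should then contain one row per logged iteration, with one column per entry in the Log operator's list.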

    The only thing I don't know is what you mean by regression info.

    Greetings,
      Sebastian
  • Xodarap Member Posts: 6 Contributor II
    Thanks! Using a plain attribute loop helped.

    The regression info I would like to store is the model output, i.e. the thing that's like "kx^n + cy^z + ...". So a row of the table would be like:

    label | formula to calculate label | error

    Is this possible to do?
  • Xodarap Member Posts: 6 Contributor II
    I take it back; what I actually want is the list of attributes that my optimizer chooses ;D. I have attached the XML of my setup. As you can see in the Log operator, what I want is some way of logging the performance of the optimizer, the name of the label, and the features that were selected as most relevant. I thought of adding a Select by Weights operator and trying to log its output, but that didn't seem to work either. Is there any way I can accomplish this?

    <operator activated="true" class="loop_attributes" expanded="true" height="60" name="Loop Attributes" width="90" x="179" y="30">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="%_Close Enc Same Day"/>
            <parameter key="iteration_macro" value="a"/>
            <process expanded="true" height="588" width="712">
              <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role" width="90" x="45" y="30">
                <parameter key="name" value="%{a}"/>
                <parameter key="target_role" value="label"/>
              </operator>
              <operator activated="true" class="filter_examples" expanded="true" height="76" name="Filter Examples" width="90" x="179" y="30">
                <parameter key="condition_class" value="no_missing_labels"/>
              </operator>
              <operator activated="true" class="replace_missing_values" expanded="true" height="94" name="Replace Missing Values" width="90" x="313" y="30">
                <list key="columns"/>
              </operator>
              <operator activated="true" class="optimize_selection" expanded="true" height="94" name="Optimize Selection" width="90" x="45" y="165">
                <parameter key="limit_number_of_generations" value="true"/>
                <parameter key="maximum_number_of_generations" value="1"/>
                <process expanded="true" height="588" width="712">
                  <operator activated="true" class="x_validation" expanded="true" height="112" name="Validation" width="90" x="173" y="54">
                    <parameter key="number_of_validations" value="2"/>
                    <process expanded="true" height="588" width="331">
                      <operator activated="true" class="linear_regression" expanded="true" height="76" name="Linear Regression" width="90" x="96" y="32"/>
                      <connect from_port="training" to_op="Linear Regression" to_port="training set"/>
                      <connect from_op="Linear Regression" from_port="model" to_port="model"/>
                      <portSpacing port="source_training" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                      <portSpacing port="sink_through 1" spacing="0"/>
                    </process>
                    <process expanded="true" height="588" width="331">
                      <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="50" y="39">
                        <list key="application_parameters"/>
                      </operator>
                      <operator activated="true" class="performance_regression" expanded="true" height="76" name="Performance" width="90" x="176" y="36">
                        <parameter key="root_mean_squared_error" value="true"/>
                      </operator>
                      <connect from_port="model" to_op="Apply Model" to_port="model"/>
                      <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                      <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                      <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                      <portSpacing port="source_model" spacing="0"/>
                      <portSpacing port="source_test set" spacing="0"/>
                      <portSpacing port="source_through 1" spacing="0"/>
                      <portSpacing port="sink_averagable 1" spacing="0"/>
                      <portSpacing port="sink_averagable 2" spacing="0"/>
                    </process>
                  </operator>
                  <connect from_port="example set" to_op="Validation" to_port="training"/>
                  <connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
                  <portSpacing port="source_example set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_performance" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="provide_macro_as_log_value" expanded="true" height="112" name="Provide Macro as Log Value" width="90" x="380" y="165">
                <parameter key="macro_name" value="%{a}"/>
              </operator>
              <operator activated="true" class="log" expanded="true" height="112" name="Log" width="90" x="581" y="120">
                <list key="log">
                  <parameter key="rmse" value="operator.Optimize Selection.value.performance"/>
                  <parameter key="Weights" value="operator.Optimize Selection.value.feature_names"/>
                  <parameter key="Feature" value="operator.Provide Macro as Log Value.parameter.macro_name"/>
                </list>
              </operator>
              <connect from_port="example set" to_op="Set Role" to_port="example set input"/>
              <connect from_op="Set Role" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Replace Missing Values" to_port="example set input"/>
              <connect from_op="Replace Missing Values" from_port="example set output" to_op="Optimize Selection" to_port="example set in"/>
              <connect from_op="Optimize Selection" from_port="example set out" to_op="Provide Macro as Log Value" to_port="through 1"/>
              <connect from_op="Optimize Selection" from_port="weights" to_op="Provide Macro as Log Value" to_port="through 2"/>
              <connect from_op="Optimize Selection" from_port="performance" to_op="Provide Macro as Log Value" to_port="through 3"/>
              <connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Log" to_port="through 1"/>
              <connect from_op="Provide Macro as Log Value" from_port="through 2" to_op="Log" to_port="through 2"/>
              <connect from_op="Provide Macro as Log Value" from_port="through 3" to_op="Log" to_port="through 3"/>
              <connect from_op="Log" from_port="through 1" to_port="example set"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
            </process>
          </operator>
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    I have thought about this for quite a while now, and I can't think of a simple solution that uses only operators. Of course, if you don't need the result in a Log, you could simply use the Reporting Extension to write it to Excel, for example. That would be relatively easy to do.

    But here is a way to solve the problem with a simple script:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="422" width="567">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75"/>
          <operator activated="true" class="execute_script" expanded="true" height="76" name="Execute Script" width="90" x="179" y="75">
            <parameter key="script" value="operator.getProcess().getMacroHandler().addMacro(&quot;result&quot;, input[0].toResultString());"/>
          </operator>
          <operator activated="true" class="provide_macro_as_log_value" expanded="true" height="76" name="Provide Macro as Log Value" width="90" x="313" y="75">
            <parameter key="macro_name" value="result"/>
          </operator>
          <operator activated="true" class="log" expanded="true" height="76" name="Log" width="90" x="447" y="75">
            <list key="log">
              <parameter key="result" value="operator.Provide Macro as Log Value.value.macro_value"/>
            </list>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Execute Script" to_port="input 1"/>
          <connect from_op="Execute Script" from_port="output 1" to_op="Provide Macro as Log Value" to_port="through 1"/>
          <connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Log" to_port="through 1"/>
          <connect from_op="Log" from_port="through 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="36"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    The Execute Script operator takes the result string of its input object and stores it in a macro called result. The Provide Macro as Log Value operator makes this macro accessible to logging, and the Log operator finally adds its value to the log.
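
    A minimal sketch of how this might be wired into the Loop Attributes subprocess posted earlier, in place of its Provide Macro as Log Value and Log operators. It assumes the iteration macro is still called a and that the result string of the weights object is readable enough for your log. The operator names "Weights to Macro", "Label Macro" and "Selection Macro" and the macro name selected are made up for illustration:

    ```xml
    <!-- Sketch only: fragment for the inner process of Loop Attributes. -->
    <operator activated="true" class="execute_script" expanded="true" name="Weights to Macro">
      <!-- Stores the weights' result string in the (hypothetical) macro "selected" and passes the input on. -->
      <parameter key="script" value="operator.getProcess().getMacroHandler().addMacro(&quot;selected&quot;, input[0].toResultString()); return input[0];"/>
    </operator>
    <operator activated="true" class="provide_macro_as_log_value" expanded="true" name="Label Macro">
      <!-- The macro name itself, not %{a}: its value is the current label attribute. -->
      <parameter key="macro_name" value="a"/>
    </operator>
    <operator activated="true" class="provide_macro_as_log_value" expanded="true" name="Selection Macro">
      <parameter key="macro_name" value="selected"/>
    </operator>
    <operator activated="true" class="log" expanded="true" name="Log">
      <list key="log">
        <parameter key="label" value="operator.Label Macro.value.macro_value"/>
        <parameter key="rmse" value="operator.Optimize Selection.value.performance"/>
        <parameter key="selected" value="operator.Selection Macro.value.macro_value"/>
      </list>
    </operator>
    <connect from_op="Optimize Selection" from_port="weights" to_op="Weights to Macro" to_port="input 1"/>
    <connect from_op="Weights to Macro" from_port="output 1" to_op="Label Macro" to_port="through 1"/>
    <connect from_op="Label Macro" from_port="through 1" to_op="Selection Macro" to_port="through 1"/>
    <connect from_op="Selection Macro" from_port="through 1" to_op="Log" to_port="through 1"/>
    <connect from_op="Optimize Selection" from_port="example set out" to_port="example set"/>
    ```

    Each iteration would then log one row containing the label name, the performance of the selection, and the weights as text. The weights text is whatever toResultString() returns for that object, so it may need trimming before it is useful as a table cell.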

    Greetings,
      Sebastian
  • Xodarap Member Posts: 6 Contributor II
    Awesome, thanks. This is exactly what I need.