Loop over pairs of attributes

Borcsa Member Posts: 4 Newbie
Hi,

I have a dataset containing two sets of generically named attributes: (label_1, ..., label_91) and (pd_1, ..., pd_91).
I would like to build a loop that runs a linear regression in each iteration, where the label is the current label_ attribute and the model is built from the matching previous-day attribute (pd_).

So I want to loop over pairs of attributes: label_1 and pd_1; label_2 and pd_2; label_3 and pd_3; etc.,
so that each linear regression uses only the correct previous-day data for its label attribute.

I cannot find a solution, because Loop Attributes iterates over only one attribute at a time (e.g. the label_ attributes here).
Nesting two Loop Attributes operators would loop over all pd_ attributes for each label_, which is not what I need: I only need the one matching pd_ for each label_.
Loop Attribute Subsets has the same problem: I don't need all combinations, only the selected pairs.

Could you suggest any solution?
Thank you very much for your help in advance!

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist
    Solution Accepted
    Hi,

    Sorry for the delay, here is an example:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.10.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.10.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="generate_data" compatibility="9.10.001" expanded="true" height="68" name="Generate Data" width="90" x="179" y="34">
            <parameter key="target_function" value="random"/>
            <parameter key="number_examples" value="100"/>
            <parameter key="number_of_attributes" value="5"/>
            <parameter key="attributes_lower_bound" value="-10.0"/>
            <parameter key="attributes_upper_bound" value="10.0"/>
            <parameter key="gaussian_standard_deviation" value="10.0"/>
            <parameter key="largest_radius" value="10.0"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <operator activated="true" class="utility:create_exampleset" compatibility="9.10.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="179" y="187">
            <parameter key="generator_type" value="comma separated text"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="input_csv_text" value="First Att, Second Att&#10;att1, att2&#10;att3, att4"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="9.10.001" expanded="true" height="68" name="Extract Macro" width="90" x="313" y="187">
            <parameter key="macro" value="nExa"/>
            <parameter key="macro_type" value="number_of_examples"/>
            <parameter key="statistics" value="average"/>
            <parameter key="attribute_name" value=""/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="concurrency:loop" compatibility="9.10.001" expanded="true" height="103" name="Loop" width="90" x="447" y="34">
            <parameter key="number_of_iterations" value="%{nExa}"/>
            <parameter key="iteration_macro" value="iteration"/>
            <parameter key="reuse_results" value="true"/>
            <parameter key="enable_parallel_execution" value="false"/>
            <process expanded="true">
              <operator activated="true" class="extract_macro" compatibility="9.10.001" expanded="true" height="68" name="Extract Macro (2)" width="90" x="112" y="85">
                <parameter key="macro" value="firstAtt"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="statistics" value="average"/>
                <parameter key="attribute_name" value="First Att"/>
                <parameter key="example_index" value="%{iteration}"/>
                <list key="additional_macros">
                  <parameter key="secondAtt" value="Second Att"/>
                </list>
              </operator>
              <operator activated="true" class="delay" compatibility="9.10.001" expanded="true" height="103" name="Delay" width="90" x="246" y="34">
                <parameter key="delay" value="fixed"/>
                <parameter key="delay_amount" value="0"/>
                <parameter key="min_delay_amount" value="0"/>
                <parameter key="max_delay_amount" value="1000"/>
                <description align="center" color="transparent" colored="false" width="126">just to ensure exec order</description>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="9.10.001" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="34">
                <list key="function_descriptions">
                  <parameter key="delta_%{firstAtt}_%{secondAtt}" value="eval(%{firstAtt})-eval(%{secondAtt})"/>
                </list>
                <parameter key="keep_all" value="true"/>
              </operator>
              <connect from_port="input 1" to_op="Delay" to_port="through 1"/>
              <connect from_port="input 2" to_op="Extract Macro (2)" to_port="example set"/>
              <connect from_op="Extract Macro (2)" from_port="example set" to_op="Delay" to_port="through 2"/>
              <connect from_op="Delay" from_port="through 1" to_op="Generate Attributes" to_port="example set input"/>
              <connect from_op="Delay" from_port="through 2" to_port="output 2"/>
              <connect from_op="Generate Attributes" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="source_input 3" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
              <portSpacing port="sink_output 3" spacing="0"/>
              <description align="center" color="yellow" colored="false" height="67" resized="false" width="126" x="412" y="134">now we can work with firstAtt and secondAtt</description>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Loop" to_port="input 1"/>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Extract Macro" to_port="example set"/>
          <connect from_op="Extract Macro" from_port="example set" to_op="Loop" to_port="input 2"/>
          <connect from_op="Loop" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

    Cheers,
    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
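
For readers who find the process XML hard to follow, the logic of the example can be sketched in plain Python. This is only an illustration of the idea, not RapidMiner code: a small pair table (the "First Att" / "Second Att" columns) drives the loop, and each iteration reads one pair and creates a delta_ attribute from it. The sample values and the helper name add_pair_deltas are made up for this sketch.

```python
# Hypothetical stand-in for the data table: column name -> list of values.
data = {
    "att1": [1.0, 2.0, 3.0],
    "att2": [0.5, 1.5, 2.5],
    "att3": [10.0, 20.0, 30.0],
    "att4": [1.0, 2.0, 3.0],
}

# Pair table, mirroring the "First Att" / "Second Att" columns
# entered in Create ExampleSet (one row per pair to process).
pairs = [
    {"First Att": "att1", "Second Att": "att2"},
    {"First Att": "att3", "Second Att": "att4"},
]


def add_pair_deltas(data, pairs):
    """Add one delta_<first>_<second> column per row of the pair table."""
    result = dict(data)  # keep all existing columns (like keep_all = true)
    for row in pairs:  # one loop iteration per pair
        first_att = row["First Att"]    # plays the role of %{firstAtt}
        second_att = row["Second Att"]  # plays the role of %{secondAtt}
        name = f"delta_{first_att}_{second_att}"
        result[name] = [a - b for a, b in zip(data[first_att], data[second_att])]
    return result


result = add_pair_deltas(data, pairs)
print(sorted(result))
```

The point of the design is that the number of iterations equals the number of rows in the pair table, so only the listed pairs are processed rather than every combination.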

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist
    Hi,
    I think you need to create an example set that holds the list of tuples (it could probably be generated automatically from your table). You can then use the normal Loop operator and, in each iteration, extract the current tuple as macros.
    Let me know if you need an example for it.

    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Borcsa Member Posts: 4 Newbie
    Hi, 

    Thank you!
    Yes, I would really appreciate an example,
    especially since I couldn't find out how to create tuples.

    Kind regards, 
    Boróka 
  • Borcsa Member Posts: 4 Newbie
    Hi,

    Could you please show me how to generate a list of tuples from my attributes? I cannot find a solution for this.

    I am also wondering whether it will be a problem inside the Loop operator that I want to use the two attributes separately, since one of them will be the label attribute.

    Thank you and kind regards.
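
As a rough illustration of the two questions above, here is a minimal Python sketch (not RapidMiner code) that builds the list of tuples from the attribute names by matching the numeric suffix, and then fits one simple linear regression per pair with label_k as the target and pd_k as the only regressor. The function names pair_attributes and fit_line and the sample values are hypothetical. Inside a RapidMiner loop the label step would presumably be done by setting the role of the attribute named by the current macro, which is an assumption and not shown here.

```python
import re


def pair_attributes(names, label_prefix="label_", pd_prefix="pd_"):
    """Return [(label_k, pd_k), ...] for every index k present in both sets."""
    labels, pds = {}, {}
    for name in names:
        m = re.fullmatch(re.escape(label_prefix) + r"(\d+)", name)
        if m:
            labels[int(m.group(1))] = name
        m = re.fullmatch(re.escape(pd_prefix) + r"(\d+)", name)
        if m:
            pds[int(m.group(1))] = name
    # Keep only indices that have both a label_ and a pd_ attribute.
    return [(labels[k], pds[k]) for k in sorted(labels.keys() & pds.keys())]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical sample table: column name -> values.
table = {
    "pd_1": [1.0, 2.0, 3.0, 4.0],
    "label_1": [3.0, 5.0, 7.0, 9.0],  # constructed as 2 * pd_1 + 1
    "pd_2": [0.0, 1.0, 2.0, 3.0],
    "label_2": [4.0, 3.0, 2.0, 1.0],  # constructed as -1 * pd_2 + 4
}

pairs = pair_attributes(table.keys())

# One model per pair: only pd_k is used to explain label_k.
models = {
    label_name: fit_line(table[pd_name], table[label_name])
    for label_name, pd_name in pairs
}
print(pairs)
```

Because each iteration only looks up the two names of the current pair, treating one of them as the label and the other as the regressor does not interfere with the other attributes in the table.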
  • Borcsa Member Posts: 4 Newbie
    Thank you!