How to compare 2 examples using a loop in the Execute Script operator?

sp_IQ Member Posts: 3 Newbie
edited June 2020 in Help
I want to compare two examples on an attribute named "ID". If the IDs are equal, I then need to compare the same examples on another attribute named "Date". If the dates match, the examples should get the same rank, with ranks starting from 1 in ascending order. If the dates do not match, the examples should be ranked 1, 2, and so on. How can I achieve this using the Execute Script operator? Please help.

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    I think you can do this with a combination of Sort and Generate ID, but you may also want to check out the Sort (Advanced) operator from the Jackhammer extension. The old Finance extension also has a Rank operator, but that extension is no longer supported.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • sp_IQ Member Posts: 3 Newbie
    The data is already sorted, but I am unable to generate the desired output. I am new to RapidMiner. It would be helpful if you could share the process with me. Thanks in advance.
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi @Sp_Iq,
    I think there is no reason to use a script here. Attached is a process that goes in the direction of what you want. I'm not sure it is 100% what you need.

    Cheers,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • sp_IQ Member Posts: 3 Newbie
    Hi @mschmitz,

    I couldn't find the attached process. There's only 1 line in the attachment. Can you please attach it again? Thank you.
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    <?xml version="1.0" encoding="UTF-8"?><process version="9.7.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.7.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.7.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="85">
            <parameter key="generator_type" value="attribute functions"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions">
              <parameter key="date" value="date_add(date_now(),round(rand())*50,DATE_UNIT_DAY)"/>
              <parameter key="Generated ID" value="id%10"/>
              <parameter key="random" value="rand()"/>
            </list>
            <parameter key="add_id_attribute" value="true"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="remove_duplicates" compatibility="9.7.000" expanded="true" height="103" name="Remove Duplicates" width="90" x="179" y="85">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value="Generated ID|date"/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="treat_missing_values_as_duplicates" value="false"/>
          </operator>
          <operator activated="true" class="concurrency:join" compatibility="9.7.000" expanded="true" height="82" name="Join" width="90" x="380" y="85">
            <parameter key="remove_double_attributes" value="true"/>
            <parameter key="join_type" value="inner"/>
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="Generated ID" value="Generated ID"/>
              <parameter key="date" value="date"/>
            </list>
            <parameter key="keep_both_join_attributes" value="false"/>
          </operator>
          <operator activated="true" class="sort" compatibility="9.7.000" expanded="true" height="82" name="Sort" width="90" x="514" y="85">
            <parameter key="attribute_name" value="random"/>
            <parameter key="sorting_direction" value="decreasing"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Remove Duplicates" to_port="example set input"/>
          <connect from_op="Remove Duplicates" from_port="example set output" to_op="Join" to_port="left"/>
          <connect from_op="Remove Duplicates" from_port="duplicates" to_op="Join" to_port="right"/>
          <connect from_op="Join" from_port="join" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>


    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
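
For readers who still want the loop spelled out: the ranking described in the question reads like a dense rank of "Date" within each "ID". The sketch below shows that logic in plain Python under that assumption. It does not use the RapidMiner API; the function name, the dictionary layout, and the sample rows are hypothetical and only serve the illustration. The same loop could probably be ported to an Execute Script operator, or to Execute Python if the Python Scripting extension is installed.

```python
from itertools import groupby


def dense_rank_by_id_and_date(rows):
    """Return copies of rows with a 'rank' field (hypothetical helper).

    Assumption: within one ID, examples with the same Date share a rank,
    and each new Date gets the next rank, starting from 1.
    """
    # Sort by ID, then Date, so equal IDs are adjacent and dates ascend.
    ordered = sorted(rows, key=lambda r: (r["ID"], r["Date"]))
    ranked = []
    for _, group in groupby(ordered, key=lambda r: r["ID"]):
        rank = 0
        previous_date = None
        for row in group:
            # A Date different from the previous example's starts a new rank;
            # an identical Date reuses the current rank.
            if row["Date"] != previous_date:
                rank += 1
                previous_date = row["Date"]
            ranked.append({**row, "rank": rank})
    return ranked


# Made-up sample data; ISO date strings sort correctly as text.
rows = [
    {"ID": "A", "Date": "2020-06-01"},
    {"ID": "A", "Date": "2020-06-01"},
    {"ID": "A", "Date": "2020-06-03"},
    {"ID": "B", "Date": "2020-06-02"},
    {"ID": "B", "Date": "2020-06-05"},
]

for row in dense_rank_by_id_and_date(rows):
    print(row["ID"], row["Date"], row["rank"])
```

If the ranks should instead keep counting across IDs, or if ties should leave gaps (1, 1, 3), only the lines that update `rank` need to change.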