Order by two columns

bea11005 Member Posts: 20 Maven
edited December 2018 in Help

Hi!

I have this dataset:

userid    time
1         28
1         29
2         34
3         2
1         5
2         6

I need to order this data by user and then by time, so the desired result is:

userid    time
1         5
1         28
1         29
2         6
2         34
3         2

How can I do this? Chaining two Sort operators doesn't seem to work, whichever order I put them in.

Best Answer

  • bea11005 Member Posts: 20 Maven
    Solution Accepted

    I've found the solution: apply the two sorts in the opposite order, first by time and then by user. The second sort keeps each user's rows in the time order produced by the first. :)
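
The idea behind this answer can be illustrated outside RapidMiner. The sketch below is plain Python, not RapidMiner code, and it assumes the sort being used is stable (rows with equal keys keep their previous relative order), which the accepted result suggests but the thread does not confirm. The tuple layout and variable names are made up for illustration.

```python
# Illustration only: plain Python, using the sample data from the question.
rows = [(1, 28), (1, 29), (2, 34), (3, 2), (1, 5), (2, 6)]  # (userid, time)

# Pass 1: sort by the secondary key (time).
by_time = sorted(rows, key=lambda r: r[1])

# Pass 2: sort by the primary key (userid). Python's sorted() is stable,
# so rows with the same userid keep the time order from pass 1.
result = sorted(by_time, key=lambda r: r[0])

for userid, time in result:
    print(userid, time)
```

Sorting by the least significant key first and the most significant key last is the usual way to get a multi-key ordering out of single-key stable sorts.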

Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    There is a very nice multi-attribute Sort operator in the Jackhammer extension in the Marketplace. 

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Pavithra_Rao Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 123 RM Data Scientist

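    Hi,

    You can do this with a combination of the Sort, Group Into Collection, and Loop Collection operators: sort by userid, split the data into one example set per user, sort each of those by time, and append the results.

    Here is the XML for a sample process that does this.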

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="8.1.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="8.1.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="operator_toolbox:create_exampleset_from_doc" compatibility="0.9.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="34">
            <parameter key="Input Csv" value="userid,time&#10;1,28&#10;1,29&#10;2,34&#10;3,2&#10;1,5&#10;2,6"/>
          </operator>
          <operator activated="true" class="sort" compatibility="8.1.000" expanded="true" height="82" name="Sort" width="90" x="246" y="34">
            <parameter key="attribute_name" value="userid"/>
          </operator>
          <operator activated="true" class="numerical_to_polynominal" compatibility="8.1.000" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="380" y="34">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="userid"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="8.1.000" expanded="true" height="82" name="Set Role" width="90" x="514" y="34">
            <parameter key="attribute_name" value="userid"/>
            <parameter key="target_role" value="id"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="operator_toolbox:group_into_collection" compatibility="0.9.000" expanded="true" height="82" name="Group Into Collection" width="90" x="648" y="34">
            <parameter key="group_by_attribute" value="userid"/>
          </operator>
          <operator activated="true" class="loop_collection" compatibility="8.1.000" expanded="true" height="82" name="Loop Collection" width="90" x="782" y="34">
            <process expanded="true">
              <operator activated="true" class="sort" compatibility="8.1.000" expanded="true" height="82" name="Sort (2)" width="90" x="179" y="34">
                <parameter key="attribute_name" value="time"/>
              </operator>
              <connect from_port="single" to_op="Sort (2)" to_port="example set input"/>
              <connect from_op="Sort (2)" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_single" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="8.1.000" expanded="true" height="82" name="Append" width="90" x="916" y="34"/>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
          <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Group Into Collection" to_port="exa"/>
          <connect from_op="Group Into Collection" from_port="col" to_op="Loop Collection" to_port="collection"/>
          <connect from_op="Loop Collection" from_port="output 1" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
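
For readers who want to follow the logic of this process without opening it in RapidMiner, here is a rough Python analogue of its steps. It is only a sketch: the mapping of each step to an operator is an interpretation of the process above, and the variable names are invented for illustration.

```python
from itertools import groupby

# Sample data from the question, as (userid, time) pairs.
rows = [(1, 28), (1, 29), (2, 34), (3, 2), (1, 5), (2, 6)]

# Step 1 (roughly "Sort"): order by userid so equal ids are adjacent.
by_user = sorted(rows, key=lambda r: r[0])

# Step 2 (roughly "Group Into Collection"): one list of rows per userid.
groups = [list(g) for _, g in groupby(by_user, key=lambda r: r[0])]

# Step 3 (roughly "Loop Collection" with the inner "Sort (2)"):
# sort each group by time.
sorted_groups = [sorted(g, key=lambda r: r[1]) for g in groups]

# Step 4 (roughly "Append"): concatenate the groups back into one table.
grouped_result = [row for g in sorted_groups for row in g]

for userid, time in grouped_result:
    print(userid, time)
```

Unlike chaining two sorts, this approach does not depend on the sort being stable, because each group is sorted by time on its own.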

     Hope this helps.

     

    Cheers,