Column order problem after pivoting

sharmar6sharmar6 Member Posts: 19 Maven
edited December 2018 in Help

Hi, 

I am facing an issue of ordering the cloumns after pivoting. The index attribute is Week no. Now when I pivot the data, the columns go like : Week-1, week-10, Week-11......Week-2, Week-20,.....Week-3, Week-30,....

Whereas I need it to be :  Week-1, Week-2, Week-3.....

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_macro" compatibility="7.5.001" expanded="true" height="82" name="Generate Macro" width="90" x="112" y="85">
<list key="function_descriptions">
<parameter key="end_date" value="date_str_custom(date_add(date_now(), -1, DATE_UNIT_DAY), &quot;yyyy-MM-dd&quot;)"/>
</list>
</operator>
<operator activated="true" class="subprocess" compatibility="7.5.001" expanded="true" height="82" name="Subprocess" width="90" x="246" y="85">
<process expanded="true">
<operator activated="true" class="jdbc_connectors:read_database" compatibility="7.5.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
<parameter key="connection" value="DVDH_zDMaqsud"/>
<parameter key="query" value="SELECT `Candidate`&#10;FROM `Candidate_Name_List`&#10;-- where Candidate like '%cliff%'"/>
<enumeration key="parameters"/>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="179" y="34">
<parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(input_data)&#10;{&#10; library(gtrendsR)&#10; for (i in 1:nrow(input_data)) {&#10; &#9;&#9;trend &lt;- data.frame(gtrends(input_data$Candidate[i], time = &quot;2017-01-01 %{end_date}&quot;)$interest_over_time)&#10; &#9;&#9;if ( i == 1) {&#10; &#9;&#9;&#9;all_trends &lt;- trend&#10; &#9;&#9;} else {&#10; &#9;&#9;&#9;all_trends &lt;- rbind(all_trends, trend)&#10; &#9;&#9;}&#10; }&#10; all_trends$date &lt;- as.character(all_trends$date)&#10; return(all_trends)&#10;}&#10;"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.5.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="date|keyword|hits"/>
</operator>
<operator activated="true" class="nominal_to_date" compatibility="7.5.001" expanded="true" height="82" name="Nominal to Date (2)" width="90" x="447" y="34">
<parameter key="attribute_name" value="date"/>
<parameter key="date_format" value="yyyy-MM-dd"/>
</operator>
<connect from_op="Read Database (2)" from_port="output" to_op="Execute R" to_port="input 1"/>
<connect from_op="Execute R" from_port="output 1" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_op="Nominal to Date (2)" to_port="example set input"/>
<connect from_op="Nominal to Date (2)" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="date_to_numerical" compatibility="7.5.001" expanded="true" height="82" name="Date to Numerical" width="90" x="380" y="85">
<parameter key="attribute_name" value="date"/>
<parameter key="time_unit" value="week"/>
<parameter key="keep_old_attribute" value="true"/>
</operator>
<operator activated="true" class="sort" compatibility="7.5.001" expanded="true" height="82" name="Sort" width="90" x="514" y="85">
<parameter key="attribute_name" value="date_week"/>
</operator>
<operator activated="true" class="rename" compatibility="7.5.001" expanded="true" height="82" name="Rename" width="90" x="648" y="85">
<parameter key="old_name" value="keyword"/>
<parameter key="new_name" value="Candidate"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="7.5.001" expanded="true" height="82" name="Generate Attributes (3)" width="90" x="782" y="85">
<list key="function_descriptions">
<parameter key="Scaled_Value" value="((hits)*100)"/>
</list>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.5.001" expanded="true" height="82" name="Select Attributes" width="90" x="916" y="85">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="Candidate|date_week|hits|Scaled_Value"/>
</operator>
<operator activated="true" class="aggregate" compatibility="7.5.001" expanded="true" height="82" name="Aggregate" width="90" x="1050" y="85">
<list key="aggregation_attributes">
<parameter key="Scaled_Value" value="sum"/>
</list>
<parameter key="group_by_attributes" value="date_week|Candidate"/>
</operator>
<operator activated="true" class="pivot" compatibility="7.5.001" expanded="true" height="82" name="Pivot" width="90" x="1184" y="85">
<parameter key="group_attribute" value="Candidate"/>
<parameter key="index_attribute" value="date_week"/>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="7.5.001" expanded="true" height="82" name="Rename by Replacing" width="90" x="1318" y="85">
<parameter key="include_special_attributes" value="true"/>
<parameter key="replace_what" value="(sum\(Scaled_Value\)_)"/>
<parameter key="replace_by" value="Week-"/>
</operator>
<connect from_port="input 1" to_op="Generate Macro" to_port="through 1"/>
<connect from_op="Generate Macro" from_port="through 1" to_op="Subprocess" to_port="in 1"/>
<connect from_op="Subprocess" from_port="out 1" to_op="Date to Numerical" to_port="example set input"/>
<connect from_op="Date to Numerical" from_port="example set output" to_op="Sort" to_port="example set input"/>
<connect from_op="Sort" from_port="example set output" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Generate Attributes (3)" to_port="example set input"/>
<connect from_op="Generate Attributes (3)" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
<connect from_op="Aggregate" from_port="example set output" to_op="Pivot" to_port="example set input"/>
<connect from_op="Pivot" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

It seems it is sorting out based on first digit of the week no.

Please advise.

 

 

Best Answer

  • Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    Another solution is to modify the values of week BEFORE the pivot.  If you use "Format Numbers" you can specify that week number should always have 2 digits by using the pattern "00" (assuming you have fewer than 100 weeks, if not just expand accordingly).  That will force a leading zero into week number where needed, so then after you Pivot, it will appear in the correct order.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts

Answers

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @sharmar6 - yes that's a very common annoyance.  The quick-and-dirty fix is to use "Reorder Attributes" and then select via "user specified".  The long-term fix is to post the idea in the Product Ideas forum and hope the dev team picks it for revision.  :)

     

    Scott

     

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    @Telcontar120 why, after almost 5 years of using RM, have I never seen this operator?  Does this happen to other users or is it just me?  Thanks.

     

    Scott

     

  • Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Haha, @sgenzer it's definitely not just you.  I'm still finding new things all the time (or creative misuse of existing operators), and it's going on 7 years using RapidMiner for me!

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • sharmar6sharmar6 Member Posts: 19 Maven

    Thanks a lot.

  • BetsyCBetsyC Member Posts: 1 Contributor I
    I'm seeing the posts here and wondering if you are correct, @sgenzer .  I can't reorder the attributes after I pivot because they show as a cluster and not as separate attributes.  Sorry, I'm kind of a beginner - can someone help me understand how to reorder after a pivot?  Thanks a million.
  • Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You may not be able to do it inside the wizard, which it sounds like is what you are using.  But if you open the actual process and then add the Reorder Attributes operator and select user specified order you should be able to put them in any order you want.  Try that and see if it solves your problem.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
Sign In or Register to comment.