How to transpose and generate an incremental ID

j_quigley Member Posts: 8 Contributor I
edited December 2018 in Help

Hi all,

 

I have a data set as follows:

Item         Seller
apples       seller1
oranges      seller2
apples       seller2
kiwi         seller2
sprouts      seller 3
pineapple    seller 3

 

 

I want to end up with the following:

Seller      Item 1     Item 2      Item 3   Item 4
Seller 1    apples     -           -        -
Seller 2    oranges    apples      kiwi     -
Seller 3    sprouts    pineapple   -        -

 

I think I need to use the Pivot operator, but before that I need to generate an attribute containing the values Item 1, Item 2, Item 3, Item 4, etc.

 

How would I do that? Should I use a regular expression of the form 'Item (X)', where X increments with each row up to a maximum determined by the number of rows for each distinct Seller value?

 

Any help would be much appreciated!

 

Answers

  • sgenzer (Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator) Posts: 2,959

    I found it easier to use Generate ID inside Loop Values, looping over Seller, and then Append the results and Pivot on the generated id:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="8.1.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="8.1.000" expanded="true" height="68" name="Retrieve j_quigley" width="90" x="45" y="85">
            <parameter key="repository_entry" value="//RapidMiner OneDrive/random community stuff/j_quigley"/>
          </operator>
          <operator activated="true" class="concurrency:loop_values" compatibility="8.1.000" expanded="true" height="82" name="Loop Values" width="90" x="179" y="85">
            <parameter key="attribute" value="Seller"/>
            <process expanded="true">
              <operator activated="true" class="filter_examples" compatibility="8.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="45" y="34">
                <list key="filters_list">
                  <parameter key="filters_entry_key" value="Seller.equals.%{loop_value}"/>
                </list>
              </operator>
              <operator activated="true" class="generate_id" compatibility="8.1.000" expanded="true" height="82" name="Generate ID" width="90" x="179" y="34"/>
              <connect from_port="input 1" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Generate ID" to_port="example set input"/>
              <connect from_op="Generate ID" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="8.1.000" expanded="true" height="82" name="Append" width="90" x="313" y="85"/>
          <operator activated="true" class="pivot" compatibility="8.1.000" expanded="true" height="82" name="Pivot" width="90" x="447" y="85">
            <parameter key="group_attribute" value="Seller"/>
            <parameter key="index_attribute" value="id"/>
          </operator>
          <connect from_op="Retrieve j_quigley" from_port="output" to_op="Loop Values" to_port="input 1"/>
          <connect from_op="Loop Values" from_port="output 1" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_op="Pivot" to_port="example set input"/>
          <connect from_op="Pivot" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
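
    For anyone who wants to check the logic outside RapidMiner, here is a minimal plain-Python sketch of the same two steps: number the rows within each seller (roughly what Filter Examples plus Generate ID does per loop value), then spread those numbers into columns (roughly what Pivot does). This is not RapidMiner code. The function names `add_per_seller_id` and `pivot_by_id` are made up for illustration, the "-" placeholder for empty cells is an assumption taken from the desired output in the question, and the sample rows are copied from the question as written.

```python
from collections import defaultdict


def add_per_seller_id(rows):
    """Attach a 1-based counter to each row that restarts for every seller.

    rows is a list of (item, seller) pairs; the result is a list of
    (seller, id, item) triples in the original row order.
    """
    counts = defaultdict(int)
    numbered = []
    for item, seller in rows:
        counts[seller] += 1
        numbered.append((seller, counts[seller], item))
    return numbered


def pivot_by_id(numbered, missing="-"):
    """Turn (seller, id, item) triples into one list of items per seller.

    Every list has the same width (the largest id seen); cells with no
    item are filled with the `missing` placeholder.
    """
    width = max((row_id for _, row_id, _ in numbered), default=0)
    wide = {}
    for seller, row_id, item in numbered:
        wide.setdefault(seller, [missing] * width)[row_id - 1] = item
    return wide


# Sample data from the question, kept exactly as posted.
rows = [
    ("apples", "seller1"),
    ("oranges", "seller2"),
    ("apples", "seller2"),
    ("kiwi", "seller2"),
    ("sprouts", "seller 3"),
    ("pineapple", "seller 3"),
]

with_ids = add_per_seller_id(rows)
wide = pivot_by_id(with_ids)

# Print a header row followed by one row per seller.
width = max(len(items) for items in wide.values())
print("Seller", *[f"Item {i}" for i in range(1, width + 1)])
for seller, items in wide.items():
    print(seller, *items)
```

    Note that the number of Item columns comes from the data: the largest group here has three items, so the sketch produces three Item columns, not the four shown in the question's example.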


     

    Scott

     
