"Polynominal to binominal and aggregate behaviour"

frankstyle Member Posts: 3 Contributor I
edited June 2019 in Help
Hi everybody, I'm pretty new to RapidMiner, but I think it is a great tool.
I'm stuck on a couple of data conversion tasks, and I'm sure there are many experts here who can help me :-)

1) How can I create several binominal attributes from one polynominal attribute? For example, I have this attribute:
color
white
orange
black
...and I want to convert it to 3 binominal attributes:
white    orange   black
1        0        0
0        1        0
0        0        1

2) Next, I love the Aggregate operator, but it doesn't seem to work like its SQL counterpart: in RapidMiner it drops every attribute that is neither aggregated nor grouped by. How can I get something like this (silly) SQL statement in RapidMiner?

SELECT name,birthday,email,SUM(monthlysalary) as totalsalary FROM mytable GROUP BY email

Thank you so much for your help!

Answers

  • fras Member Posts: 93 Contributor II
    I put everything into one process (below) that illustrates the RapidMiner way of doing this kind of thing.
    If you would like to do further calculations with aggregated attributes like "sum(...", you have to
    replace the brackets in their names first.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.0.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.003" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_team_profit_data" compatibility="6.0.003" expanded="true" height="60" name="Generate Team Profit Data" width="90" x="45" y="75"/>
          <!-- Sum "average years of experience" per leader; only the grouped and aggregated attributes survive this step. -->
          <operator activated="true" class="aggregate" compatibility="6.0.003" expanded="true" height="76" name="Aggregate" width="90" x="179" y="75">
            <list key="aggregation_attributes">
              <parameter key="average years of experience" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="leader"/>
          </operator>
          <!-- Join the aggregated result back onto the original data (Aggregate's "original" port) by leader, so the other attributes come back. -->
          <operator activated="true" class="join" compatibility="6.0.003" expanded="true" height="76" name="Join" width="90" x="313" y="75">
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="leader" value="leader"/>
            </list>
          </operator>
          <!-- Convert the nominal attribute "leader" into numerical dummy attributes, one per value. -->
          <operator activated="true" class="nominal_to_numerical" compatibility="6.0.003" expanded="true" height="94" name="Nominal to Numerical" width="90" x="447" y="75">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="leader"/>
            <list key="comparison_groups"/>
          </operator>
          <connect from_op="Generate Team Profit Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_op="Join" to_port="left"/>
          <connect from_op="Aggregate" from_port="original" to_op="Join" to_port="right"/>
          <connect from_op="Join" from_port="join" to_op="Nominal to Numerical" to_port="example set input"/>
          <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
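
For anyone who finds the operator chain easier to follow as code, here is a rough plain-Python sketch of the same three ideas: aggregate per group, join the aggregate back onto the original rows so no attribute is lost, and dummy-code a nominal attribute into 0/1 columns. This is not RapidMiner code. The sample rows, the function names, and the `color_<value>` column naming are made up for illustration, and the bracket renaming only mimics the "sum(..." naming mentioned above.

```python
import re
from collections import defaultdict


def aggregate_and_join(rows, group_key, value_key):
    """Sum value_key per group_key, then join the sum back onto every row."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    agg_name = f"sum({value_key})"  # assumed naming, mimicking "sum(..."
    # Every original row keeps all its attributes and gains the group total.
    return [{**row, agg_name: totals[row[group_key]]} for row in rows]


def strip_brackets(name):
    """Turn 'sum(x)' into 'sum_x' so the name is easier to reuse later."""
    return re.sub(r"[()]+", "_", name).strip("_")


def dummy_code(rows, attribute):
    """Replace one nominal attribute with a 0/1 column per distinct value."""
    values = sorted({row[attribute] for row in rows})
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != attribute}
        for value in values:
            # Hypothetical column naming: <attribute>_<value>
            new_row[f"{attribute}_{value}"] = int(row[attribute] == value)
        result.append(new_row)
    return result


# Hypothetical rows standing in for "mytable" from the question.
rows = [
    {"name": "Ann", "email": "ann@example.com", "color": "white", "monthlysalary": 1000},
    {"name": "Ann", "email": "ann@example.com", "color": "white", "monthlysalary": 1500},
    {"name": "Bob", "email": "bob@example.com", "color": "black", "monthlysalary": 2000},
]

joined = aggregate_and_join(rows, "email", "monthlysalary")
renamed = [{strip_brackets(k): v for k, v in row.items()} for row in joined]
encoded = dummy_code(renamed, "color")
print(encoded[0])
```

The join-back step is also roughly what the SQL in the question would need in a strict database: aggregate in a subquery grouped by email, then join that result to the original table.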

  • frankstyle Member Posts: 3 Contributor I
    Awesome. It works great!
    Thank you so much, fras