
[SOLVED] Grouping rows and summing each group

Brinnington Member Posts: 2 Contributor I
edited November 2018 in Help
Hi.

I would like your advice.

I have a dataset which has around 180 rows relating to one date, and another 180 rows relating to a second date, and so on.

I want to sum the attributes for the 180 rows of the first date, then sum the attributes for the 180 rows of the second date, and so on.

I have tried to do this using the Aggregate operator, but it just produces averages. I can't get it to produce sums.

Do you happen to know if there is an operator which will sum the rows according to each date?

Or maybe I am using the Aggregate operator incorrectly.

Either way, I would gladly appreciate your input and experience on this.

Thank you.

Bob

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    You have to set the date as the grouping attribute and something else as the aggregation attribute, as in the process below.

    Best,
    Marius
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
        <process expanded="true" height="145" width="413">
          <operator activated="true" class="generate_sales_data" compatibility="5.2.000" expanded="true" height="60" name="Generate Sales Data" width="90" x="112" y="30"/>
          <operator activated="true" class="aggregate" compatibility="5.2.000" expanded="true" height="76" name="Aggregate" width="90" x="313" y="30">
            <list key="aggregation_attributes">
              <parameter key="amount" value="sum"/>
            </list>
            <parameter key="group_by_attributes" value="|date"/>
          </operator>
          <connect from_op="Generate Sales Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
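
    Since the question asks for sums of several attributes per date, the same process can probably be extended by adding one entry per attribute to the aggregation_attributes list. The following is only a sketch under an assumption: it presumes that the example set produced by Generate Sales Data also contains a numeric attribute named single_price. For your own data, replace amount and single_price with your attribute names.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
        <process expanded="true" height="145" width="413">
          <operator activated="true" class="generate_sales_data" compatibility="5.2.000" expanded="true" height="60" name="Generate Sales Data" width="90" x="112" y="30"/>
          <operator activated="true" class="aggregate" compatibility="5.2.000" expanded="true" height="76" name="Aggregate" width="90" x="313" y="30">
            <!-- One entry per attribute to aggregate; each uses the sum function. -->
            <!-- Assumption: single_price exists in the generated data. -->
            <list key="aggregation_attributes">
              <parameter key="amount" value="sum"/>
              <parameter key="single_price" value="sum"/>
            </list>
            <!-- One output row per distinct value of date. -->
            <parameter key="group_by_attributes" value="|date"/>
          </operator>
          <connect from_op="Generate Sales Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    The result should contain one row per date with one sum column for each listed attribute.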
  • Brinnington Member Posts: 2 Contributor I
    Hi Marius,

    Thank you very much for your guidance.

    I selected the attribute I wanted to aggregate through the Edit List next to the aggregation attributes, and chose sum as the aggregation function. Then I selected the date attribute using the Select Attributes next to group by attributes. And I got what I wanted: the sums for each grouping of 180 rows for each date.

    So once again, thank you.

    Best,

    Bob