
[SOLVED] Grouping rows and summing each group

Brinnington Member Posts: 2 Contributor I
edited November 2018 in Help

I would like your advice.

I have a dataset which has around 180 rows relating to one date, and another 180 rows relating to a second date, and so on.

I want to sum the attributes for those 180 rows of the first date, and I want to sum the attributes of the 180 rows of the second date, etc.

I have tried to do this using the Aggregate operator, but it just produces averages. I can't get it to produce sums.

Do you happen to know if there is an operator which will sum the rows according to each date?

Or maybe I am using the Aggregate operator incorrectly.

Either way, I would gladly appreciate your input and experience on this.

Thank you.



  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn

    You have to set the date as the grouping attribute and another attribute as the aggregation attribute, with sum as the aggregation function, as in the process below.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.000">
      <operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
        <process expanded="true" height="145" width="413">
          <operator activated="true" class="generate_sales_data" compatibility="5.2.000" expanded="true" height="60" name="Generate Sales Data" width="90" x="112" y="30"/>
          <operator activated="true" class="aggregate" compatibility="5.2.000" expanded="true" height="76" name="Aggregate" width="90" x="313" y="30">
            <!-- Attribute to aggregate and the function to apply to it -->
            <list key="aggregation_attributes">
              <parameter key="amount" value="sum"/>
            </list>
            <!-- One output row per distinct value of this attribute -->
            <parameter key="group_by_attributes" value="|date"/>
          </operator>
          <connect from_op="Generate Sales Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • Brinnington Member Posts: 2 Contributor I
    Hi Marius,

    Thank you very much for your guidance.

    I selected the attribute I wanted to aggregate through the Edit List button next to aggregation attributes, and chose sum as the aggregation function. Then I selected the date attribute using Select Attributes next to group by attributes. And I got what I wanted: the sums for each grouping of 180 rows for each date.

    So once again, thank you.
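
For anyone who wants to check the result outside RapidMiner, the grouping discussed in this thread (group by date, sum the other attributes) can be sketched in plain Python. This is only an illustration of what the aggregation computes, not how the Aggregate operator is implemented; the function name `sum_by_group`, the `price` column, and the sample rows are made up for the example.

```python
from collections import defaultdict


def sum_by_group(rows, group_key, value_keys):
    """Sum the given numeric columns for each distinct value of group_key."""
    totals = defaultdict(lambda: {k: 0 for k in value_keys})
    for row in rows:
        group = totals[row[group_key]]
        for k in value_keys:
            group[k] += row[k]
    # Convert to plain dicts so the result is easy to print and compare
    return {g: dict(t) for g, t in totals.items()}


# Hypothetical sample data: two dates, two rows each
rows = [
    {"date": "2012-01-01", "amount": 3, "price": 10.0},
    {"date": "2012-01-01", "amount": 5, "price": 2.5},
    {"date": "2012-01-02", "amount": 1, "price": 4.0},
    {"date": "2012-01-02", "amount": 2, "price": 6.0},
]

result = sum_by_group(rows, "date", ["amount", "price"])
for date, sums in result.items():
    print(date, sums)
```

Each date ends up as a single row of sums, which corresponds to choosing date under group by attributes and sum as the function for every aggregation attribute.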

