Adding all values in one column

amotley Member Posts: 17 Contributor II
edited November 2018 in Help

Hello,

I have a column of numeric values and I am trying to add them all up to get a total sum. However, every operator I have tried has simply added the values across each row and given me the total for that row.

Is there a way to add all the values in a single column and produce a total sum?

Thanks!

Best Answer

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    Amotley,

     

    I think the Aggregate operator is what you need, but you may not be familiar with how to use it. Take a look at this quick sample process (see the XML below, which you should be able to paste into your own RapidMiner window to see the process).

    All it does is generate some random sales data (100 examples, or rows) and then sum the "single_price" attribute (column) to get a total.

    Also note that the Aggregate operator can compute other summary statistics (for example, the average as well as the sum) and can create subtotals through its "group by attributes" parameter.

     

     

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="7.1.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="7.1.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_sales_data" compatibility="7.1.001" expanded="true" height="68" name="Generate Sales Data" width="90" x="112" y="136"/>
          <operator activated="true" class="aggregate" compatibility="7.1.001" expanded="true" height="82" name="Aggregate" width="90" x="313" y="136">
            <list key="aggregation_attributes">
              <parameter key="single_price" value="sum"/>
            </list>
          </operator>
          <connect from_op="Generate Sales Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
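
    As a rough sketch of the subtotal and extra-statistic options mentioned above, the variant below adds an average next to the sum and groups by one attribute. It is not part of the original sample: the parameter key "group_by_attributes" and the "product_category" attribute of Generate Sales Data are assumptions about this RapidMiner version, so check them against the Aggregate operator's parameter panel before relying on it.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="7.1.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="7.1.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_sales_data" compatibility="7.1.001" expanded="true" height="68" name="Generate Sales Data" width="90" x="112" y="136"/>
          <operator activated="true" class="aggregate" compatibility="7.1.001" expanded="true" height="82" name="Aggregate" width="90" x="313" y="136">
            <!-- Two aggregations on the same column: total and average. -->
            <list key="aggregation_attributes">
              <parameter key="single_price" value="sum"/>
              <parameter key="single_price" value="average"/>
            </list>
            <!-- Assumed key and attribute name: one output row per product category. -->
            <!-- Remove this parameter to get a single grand total instead. -->
            <parameter key="group_by_attributes" value="product_category"/>
          </operator>
          <connect from_op="Generate Sales Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    With the grouping parameter removed, the result should collapse to a single row holding the sum and average of the whole column, which is the case asked about in this thread.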

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,520 RM Data Scientist

    Hi amotley,

     

    Have you had a look at the Aggregate operator? I think it solves your problem.

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • amotley Member Posts: 17 Contributor II

    Martin, 

     

    Yes, I have tried that operator. It still does not add up all the values in the single column I want summed.

    For example, if I have a column with these values entered:

    1

    2

    3

    4

    I want to calculate the sum, which should be 10.

    I'm not finding a way to do that using the aggregate operator. 

    Is there a way to do that? 

     

    Thanks :)

  • bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist

    Check this process:

    https://github.com/patilbhupendra/Sample_RapidMiner_Processes/blob/master/add%20value%20in%20a%20column.rmp

    Notice that no "group by" is used, so the whole column is summed into a single total.

    Hopefully it helps.

  • amotley Member Posts: 17 Contributor II

    That helped a lot! Thank you!!
