Aggregate operator applied to each subset

SylvainM Member Posts: 18 Maven
edited August 2020 in Help
Hello everyone,

I apologize if this question has already been asked elsewhere, or if it is an obvious one. I'm still learning how to use RapidMiner :smile:

Here is my problem. Suppose I have a dataset that looks like this (but with many more distinct values):

Year Region Item
01      QC      CCD
01      QC      CCD
01      QC      CS

01      ON      CCD
01      ON      CS

01      NB      CCD
01      NB      CS
--------------------------
02      QC      CCD
02      QC      CS
02      QC      CS

02      ON      CS
02      ON      CS

02      NB      CCD
02      NB      CCD

I would like to get the relative percentage of each Item within each Region and Year:

Year Region Item   Proportion
01      QC      CCD   66.6%
01      QC      CS      33.3%

01      ON      CCD   50%
01      ON      CS      50%

01      NB      CCD    50%
01      NB      CS      50%
-------------------------------------
02      QC      CCD    33.3%
02      QC      CS       66.6%

02      ON      CS       100%

02      NB      CCD     100%

I tried many combinations of the operators Aggregate, Loop Values, Branch, etc., but none of them gave me this result.

Do you have any suggestions?

Thanks a lot!
Sylvain
Best Answer

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi,

    What you're trying to achieve is what SQL calls "window functions".

    You should check out this project, it's an implementation of window functions in RapidMiner.
    https://github.com/bbarany/rapidminer-windowfunctions

    You can calculate groupwise sums or counts, generate the ratios, and then aggregate according to your needs. 

    Regards,

    Balazs
  • SylvainMSylvainM Member Posts: 18 Maven
    Thanks BalazsBarany and SGolbert for your help,

    Your solution, Sebastian, is perfect! Thank you so much! :smiley:

    Best regards to both of you,
    Sylvain
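
For readers who want to see the arithmetic behind the "groupwise counts, then ratios" idea mentioned above, here is a minimal sketch in plain Python. It is not a RapidMiner process and does not use the window-functions project linked in the thread; the function name `item_proportions` and the variable names are hypothetical, chosen only for illustration. The sample rows are copied from the question.

```python
from collections import Counter

# Sample rows (Year, Region, Item), copied from the question.
rows = [
    ("01", "QC", "CCD"), ("01", "QC", "CCD"), ("01", "QC", "CS"),
    ("01", "ON", "CCD"), ("01", "ON", "CS"),
    ("01", "NB", "CCD"), ("01", "NB", "CS"),
    ("02", "QC", "CCD"), ("02", "QC", "CS"), ("02", "QC", "CS"),
    ("02", "ON", "CS"), ("02", "ON", "CS"),
    ("02", "NB", "CCD"), ("02", "NB", "CCD"),
]


def item_proportions(rows):
    """Return {(year, region, item): share of the item within (year, region)}."""
    # Step 1: count rows per (year, region, item).
    item_counts = Counter(rows)
    # Step 2: count rows per (year, region); this is the "window" total.
    group_totals = Counter((year, region) for year, region, _ in rows)
    # Step 3: divide each item count by the total of its group.
    return {
        (year, region, item): count / group_totals[(year, region)]
        for (year, region, item), count in item_counts.items()
    }


proportions = item_proportions(rows)
for (year, region, item), share in proportions.items():
    print(f"{year}  {region}  {item}  {share:.1%}")
```

The shares are printed rounded to one decimal place, so two thirds appears as 66.7% rather than the truncated 66.6% shown in the question. In RapidMiner terms, steps 1 and 2 roughly correspond to two aggregations at different grouping levels, and step 3 to joining them and generating a ratio attribute, though the exact operators to use are left open here.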

