'% of total' and 'group by'... help needed

LilCLilC Member Posts: 24 Learner I
edited July 2020 in Help
Hey guys,
In the case of different stores and sales:
Firstly, I need to add a column to calculate for each store to see the percentage of total sales of the whole dataset.   (Update: I just learned Extract Marco, I got this part sorted, just need to find out how to show 1.001%, not just 0.01001 )

Then, I want to group the first top 10 store. So the top 10 will be labelled as top stores, and the other stores will be labelled as others.

Basically, the final result I want to get is a pie chart to see the top stores' percentage of sales against others.

I kinda know it is possibly aggregation operator, but I still can not find out how to make it work.
The other option is to use Turbo Prep, I am open to this as well.

Thanks for all your help. 

Best Answer


  • Options
    BalazsBaranyBalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn

    If you sort by the sales descending and use Filter Example Range, you can easily select the top 10. But do the global sum calculation before that.

    Then you use Generate Attribute to get the current sales percentage in relation to the total.

    There's an operator Format Numbers that takes numerical columns and applies a format string to them. This should help you with the percent format.

    If you do it in a process, you can always execute it again with new numbers. With Turbo Prep you'd need to do the same steps again when you get new numbers. (But you can save the Turbo Prep steps as a process, too.)

    I wouldn't use a pie chart that just shows the top 10 unless there's also a "all the others" category. It would create a false impression.


  • Options
    LilCLilC Member Posts: 24 Learner I
    Hi, this is I am trying to do, group the others, and show top 10. But not sure how to group up.
Sign In or Register to comment.