How to calculate the difference between every two years and k-means clustering question.

Evelyn Member Posts: 2 Newbie
edited December 2019 in Help

Currently this is my data view

First, I need the difference in revenue-expenses between each pair of consecutive years (2018-2017, 2017-2016, ...).
The result must be expressed as a percentage.
How can I do this? I have tried Generate Attributes and date diff, but neither worked.

Besides that, these are my k-means clustering results

Although the result is already correct, my lecturer asked me to convert it into percentage form.

Can anyone help? Thanks in advance, I really appreciate your help.


    Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    The answer to your first question is to use the Generate Attributes operator. The issue you faced is probably that your attribute names appear to be numbers, so you may not have referenced them correctly, and RapidMiner didn't know that you were trying to calculate from another attribute. You reference an attribute by enclosing its name in square brackets. Also, there are several ways to generate percentages, depending on the denominator you select. One example would be to define a new attribute such as "2018_percent_change" as ([2018.0]-[2017.0])/[2017.0] (this gives a fraction; multiply by 100 if you want the value in percent).
    You would of course need to do this for each attribute you want to create. If you have many of them and building them by hand is tedious, this can also be automated with a loop and macros.
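
    For readers who want to check the arithmetic outside RapidMiner, here is a minimal Python sketch of the same calculation, looping over consecutive years instead of writing each expression by hand. It is not RapidMiner code: the function name `percent_changes` and the sample figures are hypothetical and do not come from the thread. It assumes the earlier year is the denominator, as in the expression above, and returns `None` when that value is zero.

```python
def percent_changes(row, years):
    """Percent change between each pair of consecutive years.

    row   -- mapping of year -> value (e.g. revenue minus expenses)
    years -- the years to compare; they are sorted before pairing
    """
    changes = {}
    ordered = sorted(years)
    for prev, curr in zip(ordered, ordered[1:]):
        base = row[prev]
        if base == 0:
            # The change is undefined when the earlier year's value is zero.
            changes[f"{curr}_percent_change"] = None
        else:
            # Same formula as ([curr]-[prev])/[prev], scaled to percent.
            changes[f"{curr}_percent_change"] = (row[curr] - base) * 100 / base
    return changes


# Hypothetical figures for illustration only (not the poster's data).
example = {2016: 200.0, 2017: 250.0, 2018: 300.0}
result = percent_changes(example, [2016, 2017, 2018])
print(result)
```

    Choosing a different denominator (for example the later year, or the average of the two) changes the numbers, so it is worth confirming which definition is expected.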
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
    sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager