How to create an attribute with the difference

hugomsouto Member Posts: 4 Contributor I
Hi, everyone.

I am stuck. I need to create an attribute holding the difference between the value of another attribute in the same row and the value of that same attribute in the row just above.

In Excel, it would be something like this in the column: "=[@[lanData_time]]-A(number of the row just above)". Could anyone help me?

Thanks!

Best Answer

  • tftemme Posts: 103  RM Research
    Solution Accepted
    Hi @hugomsouto

    I am not completely sure, but do you want to compute the difference between consecutive values of an attribute, a_i - a_(i-1)? If so, try the Differentiate operator from the time series operators. Deselect 'overwrite attributes' to create a new attribute containing the differentiated values.

    Best regards,
    Fabian
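
For readers who want to see the arithmetic outside RapidMiner, here is a minimal Python sketch of the a_i - a_(i-1) calculation. It is not the Differentiate operator's implementation; the function name `row_differences` and the sample values are made up for illustration, and the sketch assumes the first row simply has no defined difference.

```python
def row_differences(values):
    """Return a new list where entry i is values[i] - values[i - 1].

    The first entry is None because the first row has no row above it
    (an assumption about how to treat the missing predecessor).
    """
    if not values:
        return []
    diffs = [None]
    for i in range(1, len(values)):
        diffs.append(values[i] - values[i - 1])
    return diffs


# Hypothetical sample values standing in for an attribute column.
times = [10, 12, 15, 21]
print(row_differences(times))  # [None, 2, 3, 6]
```

Returning a new list rather than modifying the input mirrors the idea of creating a new attribute instead of overwriting the original one.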

Answers

  • jacobcybulski Member, University Professor Posts: 83   Unicorn
    edited June 7
    OK, here is a not very elegant solution. Let's say we want to find the difference between people's ages (Age attribute). I have created an attribute to collect the differences (AgeDiff), and its values are initially undefined. Set a macro (such as PrevAge) to the initial value for the attribute of interest (Age), here zero. Then loop over your examples.
    At each iteration:

    Extract the value of the attribute of interest into the macro CurrAge, calculate the difference between CurrAge and PrevAge and save it in the macro AgeDiff, then set the value of the attribute AgeDiff in the example pointed to by the loop index to the value of the macro AgeDiff, and finally set the macro PrevAge to CurrAge.

    Done:

    Jacob (see the example attached as RMP)
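
The loop described above can be sketched in plain Python to make the bookkeeping explicit. This is only an illustration of the logic, not the attached RapidMiner process: the variables `prev_age` and `curr_age` stand in for the PrevAge and CurrAge macros, and the list of dictionaries is a hypothetical stand-in for an example set.

```python
def add_age_diff(examples, initial_prev=0):
    """Fill an AgeDiff field on each example using a running 'previous' value."""
    prev_age = initial_prev                # plays the role of the PrevAge macro
    for example in examples:               # loop over the examples in order
        curr_age = example["Age"]          # plays the role of the CurrAge macro
        example["AgeDiff"] = curr_age - prev_age
        prev_age = curr_age                # remember this row for the next one
    return examples


# Hypothetical data for illustration only.
people = [{"Age": 30}, {"Age": 25}, {"Age": 41}]
add_age_diff(people)
print([p["AgeDiff"] for p in people])  # [30, -5, 16]
```

Note that with an initial value of zero the first row's difference is just its own value, so it may be worth treating that first row separately.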
  • jacobcybulski Member, University Professor Posts: 83   Unicorn
    This one is probably even less elegant and definitely not as efficient.



    You duplicate your input, add a dummy example to the copy, merge the two so that each example is paired with the values of the preceding one, calculate the difference, get rid of the extra example at the end (somehow), and select only the attributes of interest (because of the duplicated attributes).

    Jacob
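
A rough Python analogue of this duplicate-and-merge idea is sketched below. It assumes the dummy example is added at the start of the copy so that the copy is shifted down by one row; the function name `diff_by_shifted_copy` is hypothetical, and the pairing with `zip` stands in for the merge step.

```python
def diff_by_shifted_copy(values):
    """Pair each value with its predecessor via a shifted copy, then subtract."""
    # Copy of the input with a dummy entry in front, so row i of the copy
    # holds the value from row i - 1 of the original.
    shifted = [None] + list(values)
    # zip stops at the shorter sequence, which drops the extra trailing entry.
    pairs = zip(values, shifted)
    # Keep only the difference; the first row has no predecessor.
    return [None if prev is None else curr - prev for curr, prev in pairs]


print(diff_by_shifted_copy([30, 25, 41]))  # [None, -5, 16]
```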

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,226   Unicorn
    Have you tried the Lag operator from the Time Series group of operators?

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
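
To illustrate the general idea of lagging (not the Lag operator's actual parameters or output naming, which are not described in this thread), the sketch below builds a lagged copy of a column and subtracts it from the original. The helper name `lag` and its `steps` argument are assumptions made for this example.

```python
def lag(values, steps=1):
    """Return a copy of values shifted down by `steps` rows, padded with None."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    keep = max(len(values) - steps, 0)       # how many original values survive
    padding = [None] * (len(values) - keep)  # rows with no earlier value
    return padding + list(values[:keep])


values = [10, 12, 15, 21]
lagged = lag(values)  # [None, 10, 12, 15]
diffs = [None if p is None else c - p for c, p in zip(values, lagged)]
print(diffs)  # [None, 2, 3, 6]
```

With a lagged column available, the difference is a plain row-wise subtraction, which is why lagging avoids the manual duplication and merging.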
  • jacobcybulski Member, University Professor Posts: 83   Unicorn
    Great idea, @Telcontar120. In that case we do not need the awkward duplication and merging of out-of-sync example sets.
  • hugomsouto Member Posts: 4 Contributor I
    I saw the solution and forgot to thank you, @tftemme. That was exactly what I needed. So fast and easy; it only required sorting the data beforehand.

    Thanks @Telcontar120, you were almost on target. Thank you @jacobcybulski for all the efforts!
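
Because a row-to-row difference depends on row order, sorting first matters. The sketch below shows that order of operations in Python, assuming the data was sorted on the same attribute being differenced (the post does not say which attribute was used). The attribute name `lanData_time` comes from the question; the function name, the `diff` field, and the sample rows are hypothetical.

```python
def sorted_time_differences(rows, key="lanData_time"):
    """Sort rows by `key`, then attach the difference to the previous row."""
    ordered = sorted(rows, key=lambda row: row[key])
    result = []
    previous = None
    for row in ordered:
        current = row[key]
        diff = None if previous is None else current - previous
        result.append({**row, "diff": diff})  # new dict, input rows untouched
        previous = current
    return result


# Hypothetical unsorted rows for illustration only.
rows = [{"lanData_time": 30}, {"lanData_time": 10}, {"lanData_time": 25}]
print([r["diff"] for r in sorted_time_differences(rows)])  # [None, 15, 5]
```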