Count Price Changes

gianluca_scheid Member Posts: 10 Contributor I
Dear RapidMiner Community

I have a big data set with product name, retailer name and price at different points in time (time stamps).
My goal is to group the data by product and retailer name and count the number of price changes across all time stamps.

Is there an easy way to do this?

Thank you for your help

Best regards
GL

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,510 RM Data Scientist
    I think you can just use the Aggregate operator: group by product name and retailer, and use count on price to get the number.
    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • gianluca_scheid Member Posts: 10 Contributor I
    Hi Martin

    Thank you for your suggestion.

    I had already considered this option; however, it only counts the number of distinct prices, not the number of price changes:

    For example:

    t1: Product X, Retailer Y, 500USD
    t2: Product X, Retailer Y, 500USD
    t3: Product X, Retailer Y, 550USD
    t4: Product X, Retailer Y, 600USD
    t5: Product X, Retailer Y, 500USD
    t6: Product X, Retailer Y, 600USD

    When counting prices using the Aggregate function I'd get: Product X, Retailer Y, 3 distinct prices (500, 550, 600)
    However, the price changed 4 times (500 to 550, 550 to 600, 600 to 500, 500 to 600)

    Best regards
    Gianluca
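
The difference between the two numbers in the example above (3 distinct prices versus 4 price changes) can be sketched outside RapidMiner in a few lines of Python. This is only an illustration, not the process discussed in the thread: the `count_changes` helper and the variable names are hypothetical, and the prices are assumed to be sorted by time stamp already.

```python
from itertools import groupby

# Prices for Product X / Retailer Y at t1..t6, taken from the example above.
prices = [500, 500, 550, 600, 500, 600]

def count_changes(values):
    """Count how often a value differs from the one directly before it."""
    # groupby collapses runs of equal consecutive values; every run after
    # the first one corresponds to exactly one change.
    runs = sum(1 for _ in groupby(values))
    return max(runs - 1, 0)

distinct_prices = len(set(prices))
price_changes = count_changes(prices)
print(distinct_prices, price_changes)  # prints "3 4"
```

A price that returns to an earlier value (600 back to 500) adds a change but no new distinct price, which is why the two counts diverge.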
  • gianluca_scheid Member Posts: 10 Contributor I
    Hi Martin

    Thank you for your help. Unfortunately, I don't know (yet) how to import the XML code into my process...


    Best regards
    Gianluca
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited January 2019
    @gianluca_scheid

    You just need to open a new process, then go to the menu bar: View --> Show Panel --> XML. This opens an XML window. Delete the code in that window, then copy the complete code from @mschmitz and paste it there. Once you have pasted it, click the green tick mark so that the process appears in the process window. Hope this helps.

    Thanks,
    Varun
    https://www.varunmandalapu.com/

  • gianluca_scheid Member Posts: 10 Contributor I
    @varunm1 and @mschmitz: Thank you so much for your help :)
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can also use the Lag operator in the Finance and Economics extension to calculate the price difference between consecutive rows; any non-zero value is then a price change. You could additionally aggregate on those if needed.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
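
As a rough sketch of the lag idea (not the RapidMiner Lag operator itself), the following Python takes the difference to the previous row within each product/retailer group and counts the non-zero differences. The record layout, the Retailer Z rows, and the function name `count_price_changes` are assumptions made for illustration.

```python
from collections import defaultdict

# (timestamp, product, retailer, price). The Retailer Y rows follow the
# example in the thread; the Retailer Z rows are invented for illustration.
rows = [
    (1, "Product X", "Retailer Y", 500),
    (2, "Product X", "Retailer Y", 500),
    (3, "Product X", "Retailer Y", 550),
    (4, "Product X", "Retailer Y", 600),
    (5, "Product X", "Retailer Y", 500),
    (6, "Product X", "Retailer Y", 600),
    (1, "Product X", "Retailer Z", 520),
    (2, "Product X", "Retailer Z", 520),
    (3, "Product X", "Retailer Z", 510),
]

def count_price_changes(records):
    """Return {(product, retailer): number of non-zero lagged differences}."""
    series = defaultdict(list)
    # Sort by timestamp so each group's prices are in chronological order.
    for ts, product, retailer, price in sorted(records, key=lambda r: r[0]):
        series[(product, retailer)].append(price)
    changes = {}
    for key, prices in series.items():
        # Pair each price with its predecessor (lag 1) and take the difference.
        diffs = [cur - prev for prev, cur in zip(prices, prices[1:])]
        changes[key] = sum(1 for d in diffs if d != 0)
    return changes

result = count_price_changes(rows)
print(result[("Product X", "Retailer Y")])  # 4
print(result[("Product X", "Retailer Z")])  # 1
```

Grouping before taking the difference matters: a difference computed across the boundary between two product/retailer groups would otherwise be counted as a spurious price change.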
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,510 RM Data Scientist
    While this is of course technically possible, I would still recommend using my option, since it is more memory efficient.
    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Of course, @mschmitz. As is often the case, I am simply pointing out another possible way of doing the same task. There are always so many options in RapidMiner!
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts