Options

How to select the first date and the last of a group of data

demonlovesongdemonlovesong Member Posts: 16 Contributor I
edited December 2018 in Help

Hi, i have a group of sales data with multiple transaction,i am trying to use the sales data to find out the timeframe of different campaign of promotion. Within the transaction data, each item have multiple transaction and different discount. i will sort the data by the amount of discount first in order to get the date of different promotion. but now i wanted to abstract of first and last date of each campaign to determine the start date and the end date of the promotion. (this is a school project, that is why we are given limited data). Thank you

Answers

  • Options
    Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    You should be able to use Aggregate and then the functions Min and Max for the date field.  You may need to create a flag first for which records are showing price promotions (using Generate Attributes) so you can filter or group by that.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Options
    demonlovesongdemonlovesong Member Posts: 16 Contributor I

    thanks for the suggestion, however we face another problem.

     

    The transaction seems have same price at diff period For exampl

    Discount price time

    0.2                   June 1

    0.2                   June 2

    0.3                   June 3

    0.3                   June 4

    0.2                   June 5

    0.2                   June 6

    So i group the transaction by the discounted price and use min and max to find the first date and last, it will become

    0.2 June 1 -June 6

    However, i wanted to group the transaction in 

    0.2 june1 - june 2

    0.3 june3 - june 4

    0.2 june5 -june 6

    is it possible to acheive this ? thank you

  • Options
    Telcontar120Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    The data structure here is a bit messy!  I think you can probably accomplish what you want by first using the lag operator from the Series extension to identify every time the discount changes.  Then you should be able to create an indicator for a new discount (lag value > 0) and then add that to your "group by".  Or you may find it easier to create a new index variable that corresponds to the discount level and the number of times that discount has been used before (basically concatenating the discount and the lag counter described above).  Either way, I think that should get you want you want!

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Options
    demonlovesongdemonlovesong Member Posts: 16 Contributor I

    Hi, i just look in the possiblity of the lag value, learn something new! However, it is okay to achieve by one item as you mannual look count the no.of coulumn that you want to skip, but i going to use this thing to filter massive no. of product and all their date range is different. Do you have any better suggestions? Thank you so much! 

  • Options
    MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

    i think what you want to have is a Generate Session ID operator which works on absolutes.. mhh I will add this to the operator.

     

    Cheers,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
Sign In or Register to comment.