10-15-2016 05:44 PM
Hi guys, need some help here.
I have data that is organised at 1 second intervals, for each second it keeps track of the user and the product that was traded and the value of that position.
i have multiple users and each user can trade various products.
how do i,
1. classify users into usergroups?
2. obtain the change in each user-product valuation
2. group that resultant valuation into customisable blocks of time?
sample data set:
frank and frank2 would be grouped together as frankie
the first 2 seconds is 1 block.
the next 2 seconds is another block.
so i can see... in the first 2 seconds. frankie made 0 , melissa made -1
in the next 2 second block. frank seashells made 6, frank cockels made 3, melissa 4.
and also group the product together. so seashells/cockels would be grouped as seafood.
so seafood made -$1 in first 2 seconds.
and seafood made $10 in next 2 seconds.
10-16-2016 12:11 PM
i think the way to go is first to use a map or replace operator to unify the name and afterwards one aggregate with sum of value and group_by product timestamp and user.
10-16-2016 06:34 PM
Hi Martin, thanks for your help.
What about finding the difference from one timestamp to another, between product timestamp and user,
can I create a new column called value_difference ?