Filter before an aggregation

sectynsectyn Member Posts: 25 Maven
I have a set of transactions with transaction dates and amounts. For any given transaction, I want to filter out the transactions for the past one week and calculate the sum of all the transaction amounts for the filtered out set. How can I perform this filter?

Best Answer

  • jacobcybulskijacobcybulski Member, University Professor Posts: 391 Unicorn
    Solution Accepted
    I am not sure if daily aggregates are appropriate here or do you need 'one week from the minute of a transaction'. Let's say this is applicable - I am sure this can be done more cleverly but here it goes. First split the dates into year, month and day and then perform daily aggregates on all transactions, where the groups are year, month and day and aggregate on the sum of transaction value, you will then deal only with daily cumulative values. You can then collapse the date back into a single date. You now have a time series. If you have any missing dates, you may wish to fill them in with operator Equalize Time Stamp.

    Now you are ready to transform your data by creating a window of seven days (see Windowing operator) so that in a single example you will have values for all seven days of each transaction, you can then calculate create a new attribute with 7 day sum.

    If you do not like windowing and wanted to use brute force, you could do this instead, Copy your daily aggregated data set 7 ways, add 7 joins and one step at a time rejoin them based on their dates, so that in a single example you have day, its cumulative value, then cumulative value for date - 1 day, for date - 2, ..., date - 7. At this point, you can replace all those new attributes with their sum and this is what you wanted, for each transaction you will have its daily cumulative values. 
Sign In or Register to comment.