Aggregate based on values in two tables

kcallankcallan Member Posts: 4 Contributor I
edited December 2019 in Help

I have been struggling with what I feel should be a fairly simple problem and am going around in circles at this point. I have some data relating to application processing times. The main table contains application start and completion dates, and I am trying to get a count of the number of applications which are active (ie. started but not yet completed) at the end of a given quarter. Missing end dates indicated that the application is not yet complete. I have an aggregated table which contains quarter end dates and some other data which has been aggregated from the main table. I want to add the count of active applications for each quarter end date to this table. See sample data below. Help would be appreciated!

 

Quarter end dates.PNGQuarter end dates (in aggregated table)Start and End Dates.PNGApplication start and end dates (in main table)

Tagged:

Answers

  • kypexinkypexin Moderator, RapidMiner Certified Analyst, Member Posts: 291 Unicorn

    Hi @kcallan

     

    I would probably approach the problem that way: 

     

    1. In each table, generate a new attribute 'quarter' to encode all the quarters in a unique way, like 2012_3, 2012_4 etc
    2. In the second table, generate numerical attribute 'finished' with two values 0 / 1, 0 being unfinished applications and 1 finished ones
    3. Aggregate second table with grouping by 'quarter' and with aggregation function sum of the above 'finised' attribute (so for each quarter you get sum of all 1's)
    4. Join tables on 'quarter' variable
Sign In or Register to comment.