Analyzing data from two related data sets

jeganathanvelu · August 2014

Hi,

I have two data sets. The first has the application id and the complaint registration time. The second has the application id, the complaint responses, and the response registration time. The second table has multiple entries for each application.

My requirement is to identify the latest response, based on the response registration time in the second table, and map it against the application id in the first table.

For the mapping I can use the Join operator, but I don't know how to identify the latest response from the second data set using RapidMiner.

Thanks for your help in advance,
Jegan

homburg · August 2014

Hi Jegan,

Maybe you could provide more information about your second table; otherwise it is pretty hard to give you a hint on what to do next. Have you considered adding an ID to your table or generating one using RapidMiner?

Cheers,
Helge

jeganathanvelu · August 2014

Hi,

Thanks for the reply. My second table already had an ID for each entry and also a foreign key (as in RDBMS) to be used for look-up with the first table. The second table has multiple entries with the same foreign key.

While doing the join I wanted to refer to the entry with the latest time-stamp for each foreign key. I solved the issue by sorting the second table in descending order by time-stamp and then applying the Remove Duplicates operator on the foreign key. Since Remove Duplicates always retains only the first entry for a given attribute value and removes the rest, this kept only the entry with the latest time-stamp for each foreign key, and I was able to do a join to get the desired result :-)
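
For anyone who wants to check the logic outside RapidMiner, the same sort, de-duplicate, and join idea can be sketched in plain Python. This is only an illustration, not a RapidMiner process: the sample rows, the column names (application_id, registered_at, response_id, response, responded_at) and the helper functions latest_per_key and left_join are all made up for the example, and the left join is an assumption, since the join type was not specified above.

```python
from datetime import datetime
from operator import itemgetter

# Hypothetical sample data; the column names are assumptions for illustration.
complaints = [
    {"application_id": 1, "registered_at": datetime(2014, 8, 1, 9, 0)},
    {"application_id": 2, "registered_at": datetime(2014, 8, 2, 10, 30)},
    {"application_id": 3, "registered_at": datetime(2014, 8, 3, 11, 0)},
]
responses = [
    {"response_id": 10, "application_id": 1, "response": "Received",
     "responded_at": datetime(2014, 8, 1, 12, 0)},
    {"response_id": 11, "application_id": 1, "response": "Resolved",
     "responded_at": datetime(2014, 8, 4, 15, 0)},
    {"response_id": 12, "application_id": 2, "response": "Received",
     "responded_at": datetime(2014, 8, 2, 11, 0)},
]


def latest_per_key(rows, key, timestamp):
    """Sort by timestamp descending, then keep the first row seen per key."""
    ordered = sorted(rows, key=itemgetter(timestamp), reverse=True)
    latest = {}
    for row in ordered:
        # The first occurrence of each key wins; later (older) rows are dropped.
        latest.setdefault(row[key], row)
    return latest


def left_join(left_rows, right_by_key, key):
    """Attach the matching right row, if any, to each left row."""
    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key], {})
        joined.append({**match, **row})
    return joined


latest = latest_per_key(responses, "application_id", "responded_at")
result = left_join(complaints, latest, "application_id")

for row in result:
    print(row["application_id"], row.get("response"), row.get("responded_at"))
```

With a left join, applications that have no response yet (application 3 in the sample data) stay in the result with no response attached. An inner join would drop them instead.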
