
Is there an operator that works like "merge_asof"?

Kumar_Ayush Member Posts: 7 Learner I
I want to merge two data frames on their date-time columns, but the timestamps in the two data frames are recorded at different frequencies. How can I merge such data in RapidMiner using an operator?

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi!

    There are two important questions here.
    How do you want to do the join? E.g. should Timestamp B fall between the previous Timestamp A and the current Timestamp A? What about multiple entries falling into this range: do you want multiple matches or just one?
    Are the frequencies very different or similar?

    There is no easy solution for this, and the right approach depends on the answers to these questions.

    Some pointers:
    - You could convert the timestamps to numbers, round them to the same granularity in both data frames (e.g. hours), and join on the rounded value.
    - You could convert the timestamps to numbers, add the previous timestamp of A (TsA1) as a column, and use the Database Envy extension from the Marketplace with "Expression based join" for a condition like TsB >= TsA1 && TsB <= TsA.
    - There's "Equalize Time Stamps" for recalculating time series data at exact timestamps.
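
    To make the first two pointers concrete, here is a minimal sketch in plain Python. It is not RapidMiner operator code. The function names, the sample rows, and the choice to truncate (rather than round) to the hour are assumptions made for illustration only.

```python
from datetime import datetime


def floor_to_hour(ts):
    """Truncate a timestamp to the start of its hour (the shared join key)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def join_on_hour(rows_a, rows_b):
    """Inner join of (timestamp, value) rows that fall into the same hour."""
    by_hour = {}
    for ts_b, val_b in rows_b:
        by_hour.setdefault(floor_to_hour(ts_b), []).append((ts_b, val_b))
    return [
        (ts_a, val_a, ts_b, val_b)
        for ts_a, val_a in rows_a
        for ts_b, val_b in by_hour.get(floor_to_hour(ts_a), [])
    ]


def join_on_interval(rows_a, rows_b):
    """Match B rows to the A row satisfying TsB >= TsA1 and TsB <= TsA,
    where TsA1 is the previous timestamp in A."""
    rows_a = sorted(rows_a, key=lambda row: row[0])
    matches = []
    for (ts_prev, _), (ts_a, val_a) in zip(rows_a, rows_a[1:]):
        for ts_b, val_b in rows_b:
            if ts_prev <= ts_b <= ts_a:
                matches.append((ts_a, val_a, ts_b, val_b))
    return matches


# Hypothetical sample data: A is hourly, B arrives at irregular sub-hour steps.
rows_a = [
    (datetime(2024, 1, 1, 10, 0), "a1"),
    (datetime(2024, 1, 1, 11, 0), "a2"),
    (datetime(2024, 1, 1, 12, 0), "a3"),
]
rows_b = [
    (datetime(2024, 1, 1, 10, 20), "b1"),
    (datetime(2024, 1, 1, 10, 40), "b2"),
    (datetime(2024, 1, 1, 11, 0), "b3"),
    (datetime(2024, 1, 1, 11, 30), "b4"),
]

hourly_matches = join_on_hour(rows_a, rows_b)
interval_matches = join_on_interval(rows_a, rows_b)
```

    With both bounds inclusive, a B timestamp that sits exactly on an A timestamp (b3 at 11:00 in this sample) is matched to two A rows, which is the "multiple matches or just one?" question in practice. Making one bound strict gives each B row a single match.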

    So it depends on your use case.

    Regards,
    Balázs
  • btibert3 Member Posts: 3 Contributor I
    I haven't attempted this myself, but one idea is to use the Python extension, though this would require passing two different data sources into the script and writing pandas code.

    Check out the Python extension and the related GitHub documentation.
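
    A rough sketch of what the pandas side could look like, using `pandas.merge_asof`. The column names (`ts`, `a`, `b`), the sample values, and the 25-minute tolerance are made up for illustration. The `rm_main` wrapper assumes the Python extension passes each connected input to the script as a pandas DataFrame argument, so check that against the extension's documentation.

```python
import pandas as pd


def rm_main(left, right):
    """As-of join: for each row in `left`, take the latest `right` row
    whose timestamp is not later than the left timestamp."""
    return pd.merge_asof(
        left.sort_values("ts"),           # both inputs must be sorted on the key
        right.sort_values("ts"),
        on="ts",
        direction="backward",             # look back in time for the match
        tolerance=pd.Timedelta("25min"),  # hypothetical limit on the time gap
    )


# Hypothetical sample data at different frequencies.
left = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-01 10:00", "2024-01-01 11:00", "2024-01-01 12:00"]),
    "a": [1, 2, 3],
})
right = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-01 09:50", "2024-01-01 10:40", "2024-01-01 11:30"]),
    "b": [10.0, 20.0, 30.0],
})

merged = rm_main(left, right)
print(merged)
```

    In this sample the 12:00 row finds its nearest earlier match at 11:30, which is outside the tolerance, so its `b` value is left missing. Dropping `tolerance` would match it anyway, and `direction="nearest"` or `"forward"` changes which side is searched.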