Joining time series with time as index
I am trying to do something similar to what is described in this post:
I have two source files, each with essentially two attributes: "timestamp" and "att". Data is only recorded when the value of "att" changes, and the values in source file 1 and source file 2 change independently. As a result, the two data collections cover the same timeframe from start to end, but the individual data points in between mostly have different timestamps, and the number of data points (i.e. examples) differs.
I now want to join the two source files by treating the timestamp of one of them (the one with more entries) as the index. I would like to be able to answer the following question:
At timestamp t, att of source 1 had a value of x and att of source 2 a value of y.
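For illustration only, here is a minimal Python sketch (outside RapidMiner) of one way to read this question. It assumes that, because values are only recorded on change, the value at time t is the last value recorded at or before t. The sample data and the helper name `value_at` are hypothetical.

```python
from bisect import bisect_right

def value_at(timestamps, values, t):
    """Return the last recorded value at or before t.

    Assumes `timestamps` is sorted ascending and aligned with `values`.
    Returns None if t lies before the first record.
    """
    i = bisect_right(timestamps, t)
    return values[i - 1] if i > 0 else None

# Hypothetical sample data: values are recorded only when "att" changes.
src1_ts = [0.0, 1.5, 4.2, 7.0]
src1_att = [10, 11, 12, 13]
src2_ts = [0.0, 3.0, 6.5]
src2_att = ["a", "b", "c"]

t = 5.0
x = value_at(src1_ts, src1_att, t)  # att of source 1 at time t
y = value_at(src2_ts, src2_att, t)  # att of source 2 at time t
print(f"At timestamp {t}, source 1 had {x} and source 2 had {y}")
```

If the closest record (rather than the last preceding one) is what is wanted, the lookup has to compare the neighbours on both sides of t instead.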
As RapidMiner can only join on exact index matches (i.e. not on the closest value), I would like to generate an index attribute that takes the current timestamp from source 2 and finds the closest value to it among all timestamps in source 1. As a result, source file 2 would then have three columns (timestamp, att and index).
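The idea of such an index attribute can be sketched in Python as follows. This is only a sketch under assumptions, not a RapidMiner process: it assumes the closest match is meant on the timestamp column, that source 1 is sorted by timestamp, and that ties go to the earlier timestamp. The sample data and the helper name `nearest_index` are hypothetical.

```python
from bisect import bisect_left

def nearest_index(sorted_ts, t):
    """Return the position in sorted_ts of the timestamp closest to t.

    Assumes sorted_ts is sorted ascending. Ties go to the earlier timestamp.
    """
    if not sorted_ts:
        raise ValueError("sorted_ts must not be empty")
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0                      # t is at or before the first timestamp
    if i == len(sorted_ts):
        return len(sorted_ts) - 1     # t is after the last timestamp
    before, after = sorted_ts[i - 1], sorted_ts[i]
    return i - 1 if t - before <= after - t else i

# Hypothetical sample data as (timestamp, att) pairs.
src1 = [(0.0, 10), (1.5, 11), (4.2, 12), (7.0, 13)]   # the source with more entries
src2 = [(0.1, "a"), (3.0, "b"), (6.5, "c")]

src1_ts = [ts for ts, _ in src1]

# Source 2 with a third column: the closest timestamp found in source 1.
src2_indexed = [
    (ts, att, src1_ts[nearest_index(src1_ts, ts)])
    for ts, att in src2
]
```

Because the new index column contains only timestamps that actually occur in source 1, an exact-match join on that column becomes possible afterwards. The binary search keeps each lookup logarithmic in the size of source 1, which avoids comparing every pair of timestamps.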
I have looked around a lot and tried to find a solution with Macros and Loops, but I have not managed to get the desired result. Rounding the timestamp values, as proposed in the post above, does not work for me because precision is key to my process.
I can think of several use cases for such an operator, so I am convinced that there has to be a solution in which I can generate an attribute by looping through the array of existing attribute values and finding the closest value.
I would be really grateful if someone could point me in the right direction (preferably without using the ExecuteScript operator).
Thank you very much.