Difference of two dates?
I'm trying to filter the examples I have so that the date in one attribute is always smaller than in another. Basically, what I need is that "If date1 < date2 then keep the example, otherwise throw it away".
I can't find an easy way to do this (though I suspect it should be an easy operation). Filter Examples doesn't seem to accept dates. When I convert my two attributes to integers and use the attribute_value_filter with "date1_day > date2_day", Filter Examples complains that the left-hand side attribute is not numerical (it is!). So I looked for something else and tried the Generate Aggregates function, hoping to create a new attribute that is either larger than zero or not, but that function only does sums and the like, whereas I need to subtract one number from the other.
As I think I'm beginning to overcomplicate the solution I would appreciate if someone could help me out with some hints.
Thanks
Answers
Yes, you are right, it is supposed to be an easy operation. Unfortunately it won't be until the next RM release (about mid-December).
But you were on a very good track and very close to the solution: in the next RM release the value filter will also be able to compare date(att1) < date(att2) and perform similar operations.
I hope I could help,
Sebastian
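For readers who want to see the rule from the question ("if date1 < date2 then keep the example, otherwise throw it away") outside of RapidMiner, here is a minimal sketch in plain Python. It is not RapidMiner syntax; the example rows and the helper name keep_if_earlier are made up for illustration, and only the attribute names date1 and date2 come from the question.

```python
from datetime import date

# Hypothetical example rows; only the attribute names date1 and date2
# are taken from the question, the values are invented.
rows = [
    {"id": 1, "date1": date(2009, 11, 1), "date2": date(2009, 11, 5)},
    {"id": 2, "date1": date(2009, 11, 9), "date2": date(2009, 11, 3)},
    {"id": 3, "date1": date(2009, 11, 4), "date2": date(2009, 11, 4)},
]


def keep_if_earlier(examples):
    """Keep only the examples whose date1 is strictly before date2."""
    return [ex for ex in examples if ex["date1"] < ex["date2"]]


kept = keep_if_earlier(rows)

# The same decision expressed as a difference in days:
# a positive difference means date1 is earlier than date2.
differences = {ex["id"]: (ex["date2"] - ex["date1"]).days for ex in rows}

print([ex["id"] for ex in kept])
print(differences)
```

Comparing the date values directly avoids the detour through integer conversion; the difference in days is only needed if the size of the gap matters.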
Hey, I also have a question regarding dates.
I have a list with user_ids and a user has multiple dates for example:
What is the expression for something like:
For each ID, count the days from the first date to the last date for that ID, and if the total is more than 2, delete the data for that ID. If the same date appears more than once for an ID, it should count as only one day. In the example, 12 would be deleted and only 19 would stay.
In the "Filter Examples" operator I found the "condition class" setting that leads to "parameter expression", but I am not sure how to write the expression.
Does anyone have an idea?
Regards
MBM
Hi MBM,
I think you will need quite a bit of aggregation here.
First aggregate and group by userID AND Date, delete everything that has less than 2, and use Set Minus to remove it from the original data set. That should satisfy condition 2.
For the first condition: Is your data always sorted in time? In this case, you can aggregate min(date) and max(date), calculate date_diff and do the same filtering thing.
Best.
Martin
Dortmund, Germany
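One way to read the requirement that a repeated date counts only once per ID is to collapse duplicate (userID, date) pairs before any counting. The following is a rough Python sketch of that single step, not a translation of the RapidMiner operators. The IDs 12 and 19 come from the post above; the dates and the helper name distinct_days are invented for illustration.

```python
from datetime import date

# Hypothetical records: the ids 12 and 19 appear in the post,
# the dates themselves are invented.
records = [
    (12, date(2016, 3, 1)),
    (12, date(2016, 3, 1)),  # same day again, should count once
    (12, date(2016, 3, 4)),
    (19, date(2016, 3, 2)),
    (19, date(2016, 3, 3)),
]


def distinct_days(recs):
    """Collapse repeated (id, date) pairs so each day counts once per id."""
    return sorted(set(recs))


unique = distinct_days(records)

# Number of distinct days per id after removing the duplicates.
days_per_id = {}
for user_id, _day in unique:
    days_per_id[user_id] = days_per_id.get(user_id, 0) + 1

print(unique)
print(days_per_id)
```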
Hey mschmitz,
I assume yes, my data should be sorted in time. I read an old thread here, so I first sorted by date and after that by id. Now I have a huge list grouped by id together with the dates. I think your second suggestion makes sense. If I understand correctly, I need the minimum date and the maximum date of an id and then use date_diff to get the days. But how can I say "for a given id, give me the minimum date and the maximum date"? For date_diff I need those two dates.
Thanks in advance
MBM
Hey MBM,
Using Aggregate to calculate min(date) and max(date), grouped by id, should do the job.
~Martin
Dortmund, Germany
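The min/max-per-id idea can be sketched in plain Python as follows. This is an illustration of the logic only, not RapidMiner process code. The dates are invented, the function names are hypothetical, and the threshold assumes that "more than 2" means more than 2 days between the first and last date of an ID.

```python
from datetime import date

# Hypothetical records: the ids 12 and 19 appear in the thread,
# the dates are invented so that 12 spans 3 days and 19 spans 1 day.
records = [
    (12, date(2016, 3, 1)),
    (12, date(2016, 3, 1)),
    (12, date(2016, 3, 4)),
    (19, date(2016, 3, 2)),
    (19, date(2016, 3, 3)),
]


def date_span_per_id(recs):
    """Group by id, take min(date) and max(date), return the span in days."""
    bounds = {}
    for user_id, day in recs:
        first, last = bounds.get(user_id, (day, day))
        bounds[user_id] = (min(first, day), max(last, day))
    return {uid: (last - first).days for uid, (first, last) in bounds.items()}


def filter_by_span(recs, max_days=2):
    """Drop every record of an id whose span exceeds max_days (assumed threshold)."""
    spans = date_span_per_id(recs)
    return [(uid, day) for uid, day in recs if spans[uid] <= max_days]


spans = date_span_per_id(records)
remaining = filter_by_span(records)

print(spans)
print(sorted({uid for uid, _day in remaining}))
```

Because only the earliest and latest date per id are used, repeated dates do not affect the span, and the input does not need to be sorted first.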
works fine
thank you!