Function description
Dear all,
I have a data set in which the age of the subject is given as an attribute, and the values are given in
either months, years, or weeks.
e.g.: 3 days, 8 weeks, 10 months
I want to convert that attribute into a number of days, so that I can group the rows by number of days. I was trying
to use the functions "finds" and "parse", but without success. Can someone help me with this? Thank you.
Regards,
Thiru
Best Answer
kayman Member Posts: 662 Unicorn
@Thiru, now I get it :-)
In this case 'contains' is what you need, in combination with the if statement. So if Age pet contains "year", take the number times 365; else if Age pet contains "month", take the number times 30; and so on.
Something like this:
if(contains([Age pet],"year"),
parse(replaceAll([Age pet],"\\D",""))*365,
if(contains([Age pet],"month"),
parse(replaceAll([Age pet],"\\D",""))*30,
if(contains([Age pet],"week"),
parse(replaceAll([Age pet],"\\D",""))*7,
parse(replaceAll([Age pet],"\\D","")))))
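For anyone who wants to check the logic outside RapidMiner, here is a minimal Python sketch of the same nested test. The function name `age_to_days` and the sample strings are made up for illustration; the 365/30/7 factors are the approximations used in the expression above, and the sketch assumes every value contains at least one digit.

```python
import re

def age_to_days(age_text):
    """Convert a string such as '8 weeks' to an approximate number of days."""
    # Same idea as replaceAll([Age pet], "\\D", ""): drop every non-digit,
    # then parse what is left as a number.
    number = int(re.sub(r"\D", "", age_text))
    if "year" in age_text:
        return number * 365   # approximation: ignores leap years
    if "month" in age_text:
        return number * 30    # approximation: average month length
    if "week" in age_text:
        return number * 7
    return number             # anything else is treated as days

for sample in ["3 days", "8 weeks", "10 months", "6 years"]:
    print(sample, "->", age_to_days(sample))
```

Like the expression above, this matches the unit by substring, so "years" and "year" are both caught by the same test.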
Answers
The idea is that you 'regenerate' your existing attribute, so you just use your existing attribute name, but generate new content for it.
The Generate Attributes operator contains all the search, replace, splice, trim and other functions you will need.
Thanks for your reply. I was only referring to the function description in the Generate Attributes operator.
I couldn't get the syntax of the function right. Can I have some help to set it right?
Regular expressions are your friend here, but they can be frightening if you're not used to them.
Try something like the process below:
(Start a new process, copy the XML, open View -> Show Panel -> XML, paste, click the green tick in the top corner to validate and store, then go back to the process window.)
What it does is create a new field (but you can also overwrite your existing field), use a regular expression to remove everything that's not a digit (using \D), and then parse the result.
Now for weeks you can safely multiply by 7. For months it's not so straightforward, so I just took an average of 30.
Finally I used the aggregation operator to sum them all up.
Note that in reality you can combine all of the above in a single expression in Generate Attributes, but it can become a bit unreadable then.
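As a rough check of the arithmetic described above, the following Python sketch strips the non-digits from three separate fields, applies the 7 and 30 multipliers, and sums the result. The field names and the `digits_only` helper are hypothetical; this only illustrates the idea and is not the RapidMiner process itself.

```python
import re

# Hypothetical example record with one field per unit.
record = {"days": "3 days", "weeks": "8 weeks", "months": "10 months"}

# Approximate number of days per unit (30 is an average month length).
DAYS_PER_UNIT = {"days": 1, "weeks": 7, "months": 30}

def digits_only(text):
    # Remove everything that is not a digit (\D), then parse the rest.
    return int(re.sub(r"\D", "", text))

total_days = sum(
    digits_only(value) * DAYS_PER_UNIT[unit] for unit, value in record.items()
)
print(total_days)  # 3 + 8*7 + 10*30 = 359
```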
This improves things. But it will only work if we consider those values as three different attributes.
Here, all these row values are part of a single attribute, "Age". I think this will need a
different function?
Something like this:
input: 3 days, 8 weeks, 10 months
output: 359
All of these are different rows of a single attribute. That is, 3 days can be one row, 8 weeks another row, 10 months
another row, and 6 years yet another. There are many rows like that.
I'm enclosing a sample of that attribute, "Age pet". Kindly have a look at it.