"How to aggregate age in polynominal type"
Hello, all
In my raw data set, the age attribute looks like this:
John, 64,
Alice, 33years,
Bob, 22years,
Mike, 50
So some of the values have a redundant 'years' at the end. What I eventually need is the average age, and to group the examples into age groups (0-9, 10-19, 20-29, etc.).
1) If I set the attribute type to integer when reading the file, every example with the redundant 'years' ends up as a missing value.
2) If I read the attribute as polynominal, use the Replace operator to remove the redundant part of the value, and then apply Nominal to Numerical, what I get is still not a column of numeric type.
Is there a workaround for that?
Answers
Hi,
Read it in as polynominal, then use Replace on it with a regex like:
(\d+).+
replace by
$1
that way you have only the digits in the column.
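To check the pattern outside RapidMiner, here is a minimal sketch using Python's `re` module rather than the Replace operator itself. The helper name `strip_suffix` and the alternative pattern `(\d+)\D*` are assumptions for illustration, not part of the original answer. It suggests why the regex may need adjusting: `.+` requires at least one character after the digits, so a bare number gives up its last digit to satisfy it.

```python
import re

def strip_suffix(value, pattern=r"(\d+).+"):
    """Replace each match with its first capture group (like Replace with $1)."""
    return re.sub(pattern, r"\1", value)

raw = ["64", "33years", "22years", "50"]  # age values from the question

# Pattern as posted: ".+" needs at least one trailing character,
# so "64" is matched as "6" + "4" and only the "6" is kept.
as_posted = [strip_suffix(v) for v in raw]

# Possible adjustment (an assumption): allow zero or more trailing non-digits.
adjusted = [strip_suffix(v, r"(\d+)\D*") for v in raw]

print(as_posted)
print(adjusted)
```

With the adjusted pattern, values that are already clean pass through unchanged, and values with a trailing 'years' lose only the suffix.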
Then you transform it to numerical with the Parse Numbers operator.
Afterwards, you can use one of the Discretize operators to get bins and Aggregate to get avg() per Bin.
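The binning and averaging steps can be sketched in plain Python as well. This is only an illustration of the idea, not what the Discretize or Aggregate operators do internally; the function name `decade_bin` and the fixed ten-year bin width are assumptions taken from the age groups in the question.

```python
from collections import defaultdict

ages = [64, 33, 22, 50]  # example values from the question, already numeric

def decade_bin(age):
    """Return a label such as '20-29' for the ten-year group containing age."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

# Group the ages by their bin label.
groups = defaultdict(list)
for age in ages:
    groups[decade_bin(age)].append(age)

# Overall average and average per bin.
overall_avg = sum(ages) / len(ages)
avg_per_bin = {label: sum(vals) / len(vals) for label, vals in groups.items()}

print(overall_avg)
for label in sorted(avg_per_bin):
    print(label, avg_per_bin[label])
```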
Cheers,
Martin
Edit: And here is an example for it. Maybe we need to adjust this regex a bit