"Fill Missing Values Based on other Attributes"

joshhazel Member Posts: 2 Contributor I
edited June 2019 in Help
I am still in the data pre-processing stage.
I have a data set like this:
age  class   sex
10     1st     male
25     2nd    female
40     3rd     male
Other attributes also had missing values, and I filled those in using the missing-value operator's "average" option.  For the "age" attribute, however, I would like to fill in missing values based on other columns: find the average age of 1st class + male and apply it to the rows in that group with a missing age, likewise for 2nd class + female, etc.  The missing-value operator doesn't appear to offer much leeway beyond options like "average".

How can I make my dream become reality?  

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Use the Generate Attributes operator, and replace the age attribute with a formula similar to this: if(missing(age), replacement_expression, age)
    Replace replacement_expression with an expression matching your needs.

    Best regards,
    Marius
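
For readers who want to see the logic outside RapidMiner, here is a minimal sketch in plain Python of what the replacement expression has to compute: the mean age per (class, sex) group, used only where age is missing. It is not RapidMiner code. The sample rows and the function name `fill_age_by_group` are made up for illustration, and `None` is assumed to stand in for a missing value.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; None marks a missing age.
rows = [
    {"age": 10, "class": "1st", "sex": "male"},
    {"age": 30, "class": "1st", "sex": "male"},
    {"age": None, "class": "1st", "sex": "male"},
    {"age": 25, "class": "2nd", "sex": "female"},
    {"age": None, "class": "2nd", "sex": "female"},
    {"age": 40, "class": "3rd", "sex": "male"},
]


def fill_age_by_group(rows, keys=("class", "sex")):
    """Return copies of the rows with missing ages set to their group's mean age."""
    # Collect the known ages for each combination of the grouping columns.
    known_ages = defaultdict(list)
    for row in rows:
        if row["age"] is not None:
            known_ages[tuple(row[k] for k in keys)].append(row["age"])

    # One mean per group, e.g. ("1st", "male") -> 20.
    group_mean = {group: mean(ages) for group, ages in known_ages.items()}

    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        if new_row["age"] is None:
            # Same idea as if(missing(age), replacement_expression, age).
            # The age stays None when the group has no known ages at all.
            new_row["age"] = group_mean.get(tuple(row[k] for k in keys))
        filled.append(new_row)
    return filled


filled = fill_age_by_group(rows)
```

A group in which every age is missing has no mean to fall back on, so the sketch leaves those rows as `None`; an overall average could be applied to them afterwards.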