missing values

Legacy User Member Posts: 0 Newbie
edited November 2018 in Help
Hello,

I'm new to RapidMiner, and perhaps I just haven't found the right switch...

Is RapidMiner able to handle missing values?
For example, can I run a linear regression or train a neural net when the input data contains missing values?

To be clear: I don't want to replace the missing values.

Regards

Udo

Answers

  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi Udo,

    There are some learners that directly support missing values (but there is no switch for that). However, there is also a more generic approach that does not require replacing the values with something like the average: you can define a new category (let's call it "missing") and use the AttributeValueMapper to map all "?" values to this new, artificial category. This of course only works on nominal data. You can then apply any learner, and it will take the "missing" information into account.

    In the following example, I replaced the missing nominal values with the new category "missing" and the missing numerical values simply with the average. As you can see, the decision tree learner actually uses the new category, so it is really important to keep this information:

    <operator name="Root" class="Process" expanded="yes">
        <!-- Load the example set -->
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="../data/labor-negotiations.aml"/>
        </operator>
        <!-- For each nominal attribute, map "?" to the new category "missing" -->
        <operator name="NominalFeatureIterator" class="FeatureIterator" expanded="yes">
            <parameter key="type_filter" value="nominal"/>
            <operator name="NominalMissingCategory" class="AttributeValueMapper">
                <parameter key="attributes" value="%{loop_feature}"/>
                <parameter key="replace_by" value="missing"/>
                <parameter key="replace_what" value="?"/>
            </operator>
        </operator>
        <!-- Fill the remaining (numerical) missing values with the attribute average -->
        <operator name="NumericalFeatureIterator" class="FeatureIterator" expanded="yes">
            <operator name="MissingValueReplenishment" class="MissingValueReplenishment">
                <list key="columns">
                    <parameter key="%{loop_feature}" value="average"/>
                </list>
            </operator>
        </operator>
        <!-- Learn a decision tree on the prepared data -->
        <operator name="DecisionTree" class="DecisionTree">
        </operator>
    </operator>
    Hope that helps,
    Ingo
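
For readers outside RapidMiner, the same two steps can be sketched in plain Python. This is only an illustration under stated assumptions, not a translation of the process above: the row layout, the column names `pension` and `wage_increase`, the use of `None` for a missing cell, and both helper functions are made up for the example.

```python
from statistics import mean

# Assumption: a missing cell is represented as None (RapidMiner shows it as "?").
MISSING = None


def add_missing_category(rows, column, label="missing"):
    """Return new rows where missing values in a nominal column become `label`."""
    return [
        {**row, column: label if row[column] is MISSING else row[column]}
        for row in rows
    ]


def fill_with_average(rows, column):
    """Return new rows where missing values in a numerical column become the mean."""
    present = [row[column] for row in rows if row[column] is not MISSING]
    average = mean(present)
    return [
        {**row, column: average if row[column] is MISSING else row[column]}
        for row in rows
    ]


# Hypothetical toy data; these are not real values from labor-negotiations.
rows = [
    {"pension": "none", "wage_increase": 2.0},
    {"pension": None, "wage_increase": 4.0},
    {"pension": "empl_contr", "wage_increase": None},
]

cleaned = add_missing_category(rows, "pension")
cleaned = fill_with_average(cleaned, "wage_increase")

for row in cleaned:
    print(row)
```

After this step, a learner sees "missing" as just another nominal value, so a tree can split on it directly. The numerical fill is a plain replacement and does not keep any trace of which values were originally absent.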