Remove features, train, and then add back features

noah977 Member Posts: 32 Maven
edited November 2018 in Help
This is a somewhat complicated problem, so I'll do my best to describe it step by step.

I actually want to train two "systems".  The second system will learn from the output of the first.

My challenge is how to:
1) Keep all the features in my example set
2) Learn a model on a subset of features
3) Create a NEW ATTRIBUTE with the predicted value
4) Output ALL the features PLUS the new attribute.

This would be fairly easy, except for my need to train a model on a subset of features.

Does anybody have any ideas about how to handle this?

Thanks!

Here is an outline for what I am trying to do:

1) Input is data set with 10 features.
2) I want to perform regression on 8 of the features. 
    2a) Easy enough with the feature name filter.  Just eliminate the two I'm not using for regression. (For example, remove patient's name.)
3) Create the regression model
4) Write out the model
5) Apply the model to the examples
6) Create a NEW example set with the model predicted values AND THE ORIGINAL 10 FEATURES
7) Write the new example set to disk

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Noah,
    you could use the AttributeSubsetPreprocessing operator. It passes only a subset of the attributes to its inner operators. Do your learning inside this meta operator and apply the model there. Once AttributeSubsetPreprocessing has finished, the resulting inner example set is merged with the remaining attributes from the original example set.

    Here is an example:
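    <operator name="Root" class="Process" expanded="yes">
        <!-- Load the full example set; replace the path with your own .aml file. -->
        <operator name="ExampleSource" class="ExampleSource" breakpoints="after">
            <parameter key="attributes" value="C:\Dokumente und Einstellungen\sland\Eigene Dateien\Yale\RapidMiner_Zaniah\sample\data\iris.aml"/>
        </operator>
        <!-- The inner operators only see the attributes matching the regex (here a1 and a2). -->
        <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
            <parameter key="attribute_name_regex" value="a1|a2"/>
            <parameter key="condition_class" value="attribute_name_filter"/>
            <!-- Learn on the subset and keep the example set so the model can be applied to it. -->
            <operator name="DecisionTree" class="DecisionTree">
                <parameter key="keep_example_set" value="true"/>
            </operator>
            <!-- Apply the model; this adds the attribute prediction(label). -->
            <operator name="ModelApplier" class="ModelApplier" breakpoints="after"/>
            <!-- Change the role of prediction(label); no target_role is set, so the operator's default applies. -->
            <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                <parameter key="name" value="prediction(label)"/>
            </operator>
        </operator>
    </operator>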
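
If it helps to see the idea outside of RapidMiner, here is a minimal Python sketch of what the process above does conceptually: project the examples onto a subset of attributes, learn and predict on that projection only, then attach the prediction to the untouched full rows. This is not RapidMiner code. The function names (`add_prediction`, `learn_nearest_neighbour`) and the sample rows are made up for illustration, and the 1-nearest-neighbour learner is only a stand-in for whatever learner you actually use.

```python
def learn_nearest_neighbour(examples, targets):
    """Stand-in learner (hypothetical): predicts the target of the closest training example."""
    def predict(example):
        def squared_distance(other):
            return sum((example[key] - other[key]) ** 2 for key in example)
        best = min(range(len(examples)), key=lambda i: squared_distance(examples[i]))
        return targets[best]
    return predict


def add_prediction(rows, subset, label, learn):
    """Learn on `subset` only, then return full rows plus a new prediction attribute."""
    # Project every row onto the attributes the learner is allowed to see.
    projected = [{name: row[name] for name in subset} for row in rows]
    targets = [row[label] for row in rows]
    # Train on the projection only.
    model = learn(projected, targets)
    # Copy each original row (all attributes) and add the predicted value.
    new_name = f"prediction({label})"
    return [{**row, new_name: model(p)} for row, p in zip(rows, projected)]


# Made-up sample data: "name" and "id" are kept but never shown to the learner.
rows = [
    {"name": "Ann", "id": 1, "a1": 5.1, "a2": 3.5, "label": 0.2},
    {"name": "Bob", "id": 2, "a1": 7.0, "a2": 3.2, "label": 1.4},
    {"name": "Cy", "id": 3, "a1": 6.3, "a2": 3.3, "label": 2.5},
]
result = add_prediction(rows, ["a1", "a2"], "label", learn_nearest_neighbour)
```

Each entry of `result` still has all of its original attributes plus `prediction(label)`, and the input `rows` are left unchanged.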
    Greetings,
      Sebastian