"How to change role for attributes specified by regex?"

keith Member Posts: 157 Maven
edited May 2019 in Help
I have a group of attributes I want to change to a custom "donotuse" role.  This is for data that should not be used in the model, isn't the ID column, but that I want to still keep associated with the predictions and results for other purposes.

What I've tried to do is specify a regular expression that identifies the attributes, and have a single ChangeAttributeRole nested inside.  The FeatureIterator seemed like the right choice:

        <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
            <parameter key="filter" value="(name|address|city|state|zip)"/>
            <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                <parameter key="name" value="%{loop_feature}"/>
                <parameter key="target_role" value="donotuse"/>
            </operator>
        </operator>
But that only changes the role for the duration of that loop iteration.  AttributeSubsetPreprocessing specifically says that role changes are not preserved, so that's out.  What's the right way to do it?

Thanks,
Keith

Answers

  • steffen Member Posts: 347 Maven
    Hello

    Just a few remarks:
    First: I tried to do this using AttributeSubsetPreprocessing, which worked:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="\golf.aml"/>
        </operator>
        <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
            <parameter key="attribute_name_regex" value="Outlook|Temperature"/>
            <parameter key="deliver_inner_results" value="true"/>
            <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                <parameter key="name" value="Outlook"/>
                <parameter key="target_role" value="donotuse"/>
            </operator>
            <operator name="ChangeAttributeRole (2)" class="ChangeAttributeRole">
                <parameter key="name" value="Temperature"/>
                <parameter key="target_role" value="donotuse2"/>
            </operator>
        </operator>
    </operator>
    I admit that AttributeSubsetPreprocessing is useless here...
    However, giving both roles the same name did not work (change "donotuse2" to "donotuse" in the setup above). I guess there is a rule that a role can occur only once in an ExampleSet. In that case we would indeed need a special default role, or a switch-on/switch-off operator ;)

    Second: It is possible to build a workaround with ParameterIteration, ExampleSetJoin, etc. (I love the modelling abilities of RM)... but in this case you are better off changing every attribute "manually".

    greetings

    Steffen
  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    We have just added a new parameter named "work_on_input" to the "FeatureIterator" operator. If you set this parameter to false, the output of each loop run is used as the input for the next run, and the output of the final run is returned. Here is an example:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="sum"/>
        </operator>
        <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
            <parameter key="filter" value="att2|att3"/>
            <parameter key="work_on_input" value="false"/>
            <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                <parameter key="name" value="%{loop_feature}"/>
                <parameter key="target_role" value="donotuse_%{a}"/>
            </operator>
        </operator>
    </operator>
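
    Applied to the attribute list from the original question, the same pattern might look like the sketch below. It only combines the filter from the first post with the two parameters shown above. It assumes a build that already contains "work_on_input", and it assumes that the %{a} macro expands to a counter of how often the inner operator has been applied, so that each attribute receives a distinct role name (donotuse_1, donotuse_2, ...). That would sidestep the apparent restriction, mentioned earlier in this thread, that a role name can occur only once in an ExampleSet.

    ```xml
    <!-- Sketch only: assumes the work_on_input parameter is available in your build -->
    <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
        <!-- Regular expression selecting the attributes to re-role (from the original question) -->
        <parameter key="filter" value="(name|address|city|state|zip)"/>
        <!-- false: each iteration works on the previous iteration's output, so role changes accumulate -->
        <parameter key="work_on_input" value="false"/>
        <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
            <parameter key="name" value="%{loop_feature}"/>
            <!-- Assumption: %{a} yields a different number per application, keeping role names unique -->
            <parameter key="target_role" value="donotuse_%{a}"/>
        </operator>
    </operator>
    ```
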
    Cheers,
    Ingo
  • keith Member Posts: 157 Maven
    Great! Thanks Ingo.  Does that mean that the new parameter is available if I download and build the latest CVS version?
  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Yes, it is. Basic information about downloading and installing the CVS version can be found here:

    http://rapid-i.com/content/view/25/48/


    Please note, however, that we have introduced a new developer branch in CVS named "Zaniah". To get the latest features, you should no longer check out the "HEAD" branch but the "Zaniah" branch. The HEAD branch is now our bugfix branch, where only bug fixes but no new features will be added.

    Cheers,
    Ingo