
Change the domain of an attribute

andkuo_7 Member Posts: 3 Learner III
edited December 2018 in Help

Let's say I have a column 'Fruit' that can take the values 'a', 'b' or 'c':

 

ID, Fruit
1, a
2, b
3, c
4, c

 

and then I filter out every row where 'Fruit' equals 'c', so that only the 'a' and 'b' rows remain:

 

ID, Fruit
1, a
2, b

 

If I now go to Results -> Statistics and look at the Values for the Fruit attribute, it shows:

 

a(1), b(1), c(0).

 

This means the domain (i.e. the set of possible values the Fruit attribute can take; I don't know what this is called in RapidMiner nomenclature) is still [a, b, c]. How do I change the domain to [a, b]? (I don't need c anymore!)

 

My current workaround is to write the filtered data to a CSV file and then read it back in from that new file. But I suppose there is a more elegant way to do this with an operator...
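
As an illustration only, the behaviour described above can be modelled in a few lines of plain Python. This is a toy sketch, not RapidMiner's implementation: it assumes the nominal column stores its domain separately from the rows, and the names `rows`, `domain`, `filtered` and `reloaded_domain` are made up for this example. It shows why filtering leaves 'c' in the statistics, and why a CSV round trip appears to fix it.

```python
import csv
import io
from collections import Counter

# Toy model (assumption): the rows plus a separately stored domain.
rows = [(1, "a"), (2, "b"), (3, "c"), (4, "c")]
domain = ["a", "b", "c"]  # fixed when the data is first loaded

# Filtering drops rows but leaves the stored domain untouched.
filtered = [(i, fruit) for i, fruit in rows if fruit != "c"]

# Counting per domain entry reproduces the Statistics view.
counts = Counter(fruit for _, fruit in filtered)
stats = ", ".join(f"{v}({counts[v]})" for v in domain)
print(stats)  # a(1), b(1), c(0)

# CSV round trip: only cell values are written, so the old domain is lost
# and a new one is inferred from the rows that are read back.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["ID", "Fruit"])
writer.writerows(filtered)

buffer.seek(0)
reader = csv.DictReader(buffer)
reloaded_domain = sorted({row["Fruit"] for row in reader})
print(reloaded_domain)  # ['a', 'b']
```

The round trip works because the file carries no record of 'c', which is also why it is a roundabout way to shrink the domain.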

 


Best Answer

  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted

    Hi,

     

    Yes, there is indeed an operator for that :-) It is called "Remove Unused Values". The process below shows a small example.

     

    Hope this helps,

    Ingo

     

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="9.0.002">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="9.0.002" expanded="true" height="68" name="Retrieve Titanic" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Titanic"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="9.0.002" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="34">
    <list key="filters_list">
    <parameter key="filters_entry_key" value="Passenger Class.is_in.First;Second"/>
    </list>
    </operator>
    <operator activated="true" class="remove_unused_values" compatibility="9.0.002" expanded="true" height="103" name="Remove Unused Values" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Passenger Class"/>
    </operator>
    <connect from_op="Retrieve Titanic" from_port="output" to_op="Filter Examples" to_port="example set input"/>
    <connect from_op="Filter Examples" from_port="example set output" to_op="Remove Unused Values" to_port="example set input"/>
    <connect from_op="Remove Unused Values" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
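
For readers who want the idea without opening RapidMiner, here is a rough Python sketch of what pruning a nominal domain amounts to conceptually. It is not how the "Remove Unused Values" operator is implemented; the function `remove_unused_values`, the sample list `passenger_class`, and the three class labels are assumptions made up for this illustration, chosen to mirror the filter in the process above.

```python
def remove_unused_values(domain, values):
    """Return the domain restricted to values that occur, keeping the original order."""
    present = set(values)
    return [v for v in domain if v in present]


# Hypothetical stand-in for a nominal column and its stored domain.
domain = ["First", "Second", "Third"]
passenger_class = ["First", "Second", "First", "Third", "Second"]

# Same idea as the Filter Examples step: keep only First and Second.
kept = [v for v in passenger_class if v in ("First", "Second")]

# After filtering, the stored domain still lists "Third"; pruning removes it.
pruned = remove_unused_values(domain, kept)
print(pruned)  # ['First', 'Second']
```

Pruning against the values that remain keeps the original ordering of the domain, which a sort-based rebuild (such as re-reading a CSV file) would not guarantee.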