Change the domain of an attribute

andkuo_7 Member Posts: 3 Contributor I
edited December 2018 in Help

Let's say I have a column 'Fruit' that can take the values 'a', 'b' or 'c':

 

ID, Fruit
1, a
2, b
3, c
4, c

 

and then I use a filter to remove every row whose 'Fruit' value is 'c', so that only the 'a' and 'b' rows remain:

 

ID, Fruit
1, a
2, b

 

If I now go to Results -> Statistics, and look at the Values for the Fruit attribute, it will tell me

 

a(1), b(1), c(0).

 

This means the domain (i.e. the set of possible values the Fruit attribute can take; I don't know what the domain is called in RapidMiner nomenclature) is still [a, b, c]. How do I change the domain to [a, b]? (I don't need c anymore!)

 

My current workaround has been to write the filtered data to a CSV file and then read it back from that new file. But I suppose there is a more elegant way to do this with an operator...

 

Best Answer

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted

    Hi,

     

    Yes, there is an operator for that indeed :-)  It is called "Remove Unused Values".  The process below shows a little example.

     

    Hope this helps,

    Ingo

     

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="9.0.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.0.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.0.002" expanded="true" height="68" name="Retrieve Titanic" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Titanic"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="9.0.002" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="34">
            <list key="filters_list">
              <parameter key="filters_entry_key" value="Passenger Class.is_in.First;Second"/>
            </list>
          </operator>
          <operator activated="true" class="remove_unused_values" compatibility="9.0.002" expanded="true" height="103" name="Remove Unused Values" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="Passenger Class"/>
          </operator>
          <connect from_op="Retrieve Titanic" from_port="output" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_op="Remove Unused Values" to_port="example set input"/>
          <connect from_op="Remove Unused Values" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
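
    Mapped onto the Fruit example from the question, the two relevant operators might look roughly like the fragment below. This is only a sketch, not a complete process: it assumes the nominal attribute is literally named "Fruit", that the example set arrives from some upstream operator (not shown), and it reuses the parameter keys and the "is_in" filter syntax from the Titanic process above.

    ```xml
    <!-- Sketch only: "Fruit" and the values a;b are taken from the question, not from a real data set. -->
    <!-- Step 1: keep only the rows whose Fruit value is a or b (drops the c rows). -->
    <operator activated="true" class="filter_examples" compatibility="9.0.002" expanded="true" name="Filter Examples">
      <list key="filters_list">
        <parameter key="filters_entry_key" value="Fruit.is_in.a;b"/>
      </list>
    </operator>
    <!-- Step 2: drop values from the attribute's domain that no longer occur in any row. -->
    <operator activated="true" class="remove_unused_values" compatibility="9.0.002" expanded="true" name="Remove Unused Values">
      <parameter key="attribute_filter_type" value="single"/>
      <parameter key="attribute" value="Fruit"/>
    </operator>
    <connect from_op="Filter Examples" from_port="example set output" to_op="Remove Unused Values" to_port="example set input"/>
    ```

    With this in place, Results -> Statistics should list only a and b for Fruit, without the c(0) entry and without the CSV round trip.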