"Can I filter a dataset by the values of an attribute?"

el_chief Member Posts: 63 Contributor II
edited May 2019 in Help
Hello,

Say I have an exampleset like this:

[tt]
ID    Name       Job
1     Neil       Researcher
2     Ralf       BizDev
3     Ingo       Grandmaster
4     Haddock    Researcher
[/tt]

Can I filter the exampleset in RapidMiner so that only the rows with Job = Researcher remain?

If so, how?

Thanks

Neil

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi Neil,

    Yes, that's possible with the "Filter Examples" operator. Set the parameter "condition class" to "attribute_value_filter", then enter an expression such as "Job = Researcher" in the parameter named "parameter string".

    Here is an example working on one of the sample data sets:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.000" expanded="true" name="Process">
        <process expanded="true" height="116" width="279">
          <operator activated="true" class="retrieve" compatibility="5.1.000" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//Samples/data/Labor-Negotiations"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.1.000" expanded="true" height="76" name="Filter Examples" width="90" x="179" y="30">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="vacation = average"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Hope that helps. Cheers,
    Ingo
  • el_chief Member Posts: 63 Contributor II
    Thanks Ingo.

    Some other things I found:

    * spaces are allowed in the value
    * you can use the | operator for "or"
    * the attribute name and the value are both case sensitive
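
    For the table in the original question, the "Filter Examples" operator from Ingo's process could be adapted roughly as follows. This is only a sketch: it assumes the attribute is named exactly "Job" and the value is spelled exactly "Researcher" (both are case sensitive), and it reuses the operator attributes from the 5.1.000 process above with the layout attributes left out. It is a fragment meant to replace the operator inside that process, not a complete process on its own.

    ```xml
    <!-- Sketch: same operator as in the process above, only the filter expression differs. -->
    <operator activated="true" class="filter_examples" compatibility="5.1.000" expanded="true" name="Filter Examples">
      <!-- Condition class as described in Ingo's answer. -->
      <parameter key="condition_class" value="attribute_value_filter"/>
      <!-- Keep only the examples whose Job attribute equals Researcher. -->
      <parameter key="parameter_string" value="Job = Researcher"/>
    </operator>
    ```

    With the four example rows from the question, this should leave only the rows for Neil and Haddock.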