Filter Examples With Multiple Valued Attributes ex: Cluster_1 and Cluster_2

dragoljub Member Posts: 241 Contributor II
edited October 2019 in Help
Hi Guys,

Often after cluster analysis I have multiple clusters of interest. What is the easiest way to select all examples from cluster_1 and cluster_2? Ideally I would use Filter Examples with 'cluster=1 or cluster=2', but that does not work for me.

-Gagi

Answers

  • haddock Member Posts: 849 Maven
    Please note that you can define a logical OR of several conditions with ||,
    like so:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.0" expanded="true" name="Root">
        <parameter key="logverbosity" value="warning"/>
        <process expanded="true" height="406" width="835">
          <operator activated="true" class="retrieve" compatibility="5.0.0" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="../../data/Iris"/>
          </operator>
          <operator activated="true" class="k_means" compatibility="5.0.0" expanded="true" height="76" name="KMeans" width="90" x="180" y="30">
            <parameter key="k" value="3"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.0.0" expanded="true" height="76" name="ClusterModel2ExampleSet" width="90" x="315" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.0.8" expanded="true" height="76" name="Filter Examples" width="90" x="514" y="30">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="cluster=cluster_0||cluster=cluster_1"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="KMeans" to_port="example set"/>
          <connect from_op="KMeans" from_port="cluster model" to_op="ClusterModel2ExampleSet" to_port="model"/>
          <connect from_op="KMeans" from_port="clustered set" to_op="ClusterModel2ExampleSet" to_port="unlabelled data"/>
          <connect from_op="ClusterModel2ExampleSet" from_port="labelled data" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
          <connect from_op="Filter Examples" from_port="original" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
  • dragoljub Member Posts: 241 Contributor II
    WOW that was super easy! Thanks Haddock! ;D

    For everyone else who did not realize logical expressions are allowed in the filter examples operator:

    Filter Examples -> attribute_value_filter -> parameter string = cluster=cluster_0||cluster=cluster_1

    -Gagi
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    yes, sometimes everything becomes surprisingly easy if you read the documentation...

    Greetings,
      Sebastian
  • dragoljub Member Posts: 241 Contributor II
    The help section for 'filter examples' mentions that you can use the logical 'or' and 'and' operators, but the example section of the attribute value filter just has:

    "Parameter string for the condition, e.g. 'attribute=value' for the AttributeValueFilter."

    For someone who looks at a lot of documentation, this seems like a perfect place to add another example, 'attribute=value1||attribute=value2', or to mention next to the example that operators such as && and || can be used. A clear discussion of white space is also missing, and I often find that white space matters. When exploring different logical operators, you never really know whether you are entering wrong code or whether white space is breaking your process.

    I know these are small, picky details, but for this package to gain even more success these are the growing pains that documentation (and users) must go through.  ;D

    -Gagi
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Gagi,
    if you see it that way, you will be really pleased to hear that the RapidMiner operator documentation will be copied to our wiki soon. (In fact it's already there, but missing line breaks make it a pain to edit...) Then you will be able to add descriptions as you want and contribute to other users and yourself. We will retrieve the documentation from the wiki and bundle it again with RapidMiner for each update.
    We hope to make it easy enough, so that the operator documentation is finally updated.

    Greetings,
      Sebastian
  • dragoljub Member Posts: 241 Contributor II
    That's great news Sebastian! If it's easy to contribute I'm sure people will add/edit the documentation to help everyone!  ;D

    -Gagi
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    well, I wouldn't actually bet on this, but let's hope for the best...

    Greetings,
      Sebastian
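
For readers who want to see what an OR-combined condition such as 'cluster=cluster_0||cluster=cluster_1' selects, here is a minimal Python sketch of the semantics. It is only an illustration, not RapidMiner's actual parser: the helper names `parse_or_conditions` and `filter_examples` and the sample data are hypothetical, it handles only `attribute=value` terms joined by `||`, and it deliberately does no whitespace trimming, so it says nothing about how RapidMiner itself treats white space.

```python
def parse_or_conditions(parameter_string):
    """Split 'a=v1||a=v2' into a list of (attribute, value) pairs.

    Hypothetical helper: supports only equality terms joined by '||'.
    """
    conditions = []
    for term in parameter_string.split("||"):
        # Split on the first '=' only, so values may themselves contain '='.
        attribute, value = term.split("=", 1)
        conditions.append((attribute, value))
    return conditions


def filter_examples(examples, parameter_string):
    """Keep every example that satisfies at least one condition (logical OR)."""
    conditions = parse_or_conditions(parameter_string)
    return [
        example
        for example in examples
        if any(example.get(attribute) == value for attribute, value in conditions)
    ]


# Made-up examples standing in for a clustered example set.
examples = [
    {"id": 1, "cluster": "cluster_0"},
    {"id": 2, "cluster": "cluster_1"},
    {"id": 3, "cluster": "cluster_2"},
]

selected = filter_examples(examples, "cluster=cluster_0||cluster=cluster_1")
print([example["id"] for example in selected])
```

The example in cluster_2 is dropped because it matches neither term; an example only needs to match one of the `||`-joined terms to be kept.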