Unable to filter by cluster

vme64 Member Posts: 10 Contributor II
edited November 2018 in Help
Hello,

  I am trying to apply the clustering operator and then keep only the examples that belong to a specific cluster (using Filter Examples), but it does not work. I crafted a small example to reproduce the problem. It looks like a bug, but I am asking in case I am doing something silly. Any help is appreciated.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.009">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.009" expanded="true" name="Process">
    <process expanded="true" height="351" width="871">
      <operator activated="true" class="generate_sales_data" compatibility="5.1.009" expanded="true" height="60" name="Generate Sales Data" width="90" x="45" y="30">
        <parameter key="use_local_random_seed" value="true"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="5.1.009" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="single_price"/>
      </operator>
      <operator activated="true" class="k_means" compatibility="5.1.009" expanded="true" height="76" name="Clustering" width="90" x="313" y="30"/>
      <operator activated="true" class="filter_examples" compatibility="5.1.009" expanded="true" height="76" name="Filter Examples" width="90" x="447" y="30">
        <parameter key="condition_class" value="attribute_value_filter"/>
        <parameter key="parameter_string" value="cluster = &quot;cluster_0&quot;"/>
      </operator>
      <connect from_op="Generate Sales Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Clustering" to_port="example set"/>
      <connect from_op="Clustering" from_port="clustered set" to_op="Filter Examples" to_port="example set input"/>
      <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Best regards,

  Vinicius

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    the reason is the filter parameter: just remove the quotation marks (") and the spaces around the equals sign, and it will work like a charm:

    cluster=cluster_0

    Here is the corrected process:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.011">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.011" expanded="true" name="Process">
        <process expanded="true" height="351" width="871">
          <operator activated="true" class="generate_sales_data" compatibility="5.1.011" expanded="true" height="60" name="Generate Sales Data" width="90" x="45" y="30">
            <parameter key="use_local_random_seed" value="true"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.1.011" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="single_price"/>
          </operator>
          <operator activated="true" class="k_means" compatibility="5.1.011" expanded="true" height="76" name="Clustering" width="90" x="313" y="30"/>
          <operator activated="true" class="filter_examples" compatibility="5.1.011" expanded="true" height="76" name="Filter Examples" width="90" x="447" y="30">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="cluster=cluster_0"/>
          </operator>
          <connect from_op="Generate Sales Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Clustering" to_port="example set"/>
          <connect from_op="Clustering" from_port="clustered set" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Cheers,
    Ingo
  • vme64 Member Posts: 10 Contributor II
    Thanks!
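
For quick reference, the fix in the answer above comes down to a single parameter of the Filter Examples operator. Apart from the version attributes, this is the only line that differs between the two processes in this thread. The fragment below is just a side-by-side sketch of that line, not a complete process; the XML comments are added here for illustration.

```xml
<!-- Original: value wrapped in quotes (&quot; is the XML escape for ") with spaces around "=" -->
<parameter key="parameter_string" value="cluster = &quot;cluster_0&quot;"/>

<!-- Corrected: no quotes and no spaces -->
<parameter key="parameter_string" value="cluster=cluster_0"/>
```

Only one of the two lines should appear inside the Filter Examples operator; keep the corrected form.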