"Add cluster attribute" with AgglomerativeClustering

earmijoearmijo Member Posts: 270 Unicorn
I am using the AgglomerativeClustering operator to cluster a small dataset. Afterwards I apply the FlattenClusterModel operator, but I don't get an option to "add cluster attribute". Is there an easy way of doing this? (All other clustering algorithms create a cluster attribute as part of their standard output.)

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    this is easily possible if you use the ClusterModel2ExampleSet operator afterwards:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="number_examples" value="1000"/>
            <parameter key="number_of_attributes" value="2"/>
            <parameter key="target_function" value="three ring clusters"/>
        </operator>
        <operator name="AgglomerativeClustering" class="AgglomerativeClustering">
        </operator>
        <operator name="FlattenClusterModel" class="FlattenClusterModel">
        </operator>
        <operator name="ClusterModel2ExampleSet" class="ClusterModel2ExampleSet">
        </operator>
    </operator>
    Greetings,
      Sebastian
  • earmijoearmijo Member Posts: 270 Unicorn
    Thanks a lot, Sebastian. It works perfectly. Just out of curiosity: is there a reason why this algorithm in particular requires breaking the process into 2-3 steps (as opposed to just one with the other clusterers)?
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    that's easy to explain:
    Agglomerative Clustering does not assign each example to exactly one cluster. Instead, each example starts as its own cluster, and these clusters are then merged step by step until only one is left. So examples are part of many (up to n-1) clusters.
    The dendrogram visualizes this.

    So you need to cut this tree of clusters at some level, and looking at the dendrogram first gives you an impression of where to cut. That cut is what the second operator, FlattenClusterModel, does.

    Since we want to keep the operators reasonably elementary, we have not combined cutting the tree (flattening the hierarchy) with applying the result to the dataset. So the application step, which is needed anyway, is the third operator, ClusterModel2ExampleSet.

    Greetings,
      Sebastian
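
The three steps described above (build the hierarchy, cut it, apply the result to the data) can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions, not how the RapidMiner operators are implemented: the function names `build_merges`, `flatten`, and `add_cluster_attribute` are hypothetical, single linkage with squared Euclidean distance is an arbitrary choice, and the five points are made-up toy data.

```python
# Sketch only: plain-Python single-linkage clustering, not RapidMiner's code.
from itertools import combinations


def build_merges(points):
    """Step 1: merge the two closest clusters until only one is left.

    Returns the merge history as a list of (members_a, members_b) tuples,
    where each member tuple holds example indices.
    """
    clusters = [(i,) for i in range(len(points))]  # every example starts alone
    merges = []

    def dist(a, b):
        # Single linkage: squared distance of the closest pair across a and b.
        return min(
            sum((x - y) ** 2 for x, y in zip(points[i], points[j]))
            for i in a
            for j in b
        )

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


def flatten(n_examples, merges, k):
    """Step 2: cut the hierarchy so that exactly k clusters remain."""
    clusters = [(i,) for i in range(n_examples)]
    # Replay the merge history, but stop while k clusters are still separate.
    for a, b in merges[: n_examples - k]:
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters


def add_cluster_attribute(points, clusters):
    """Step 3: attach a cluster label to every example."""
    label_of = {i: label for label, members in enumerate(clusters) for i in members}
    return [(point, label_of[i]) for i, point in enumerate(points)]


# Toy data (made up for this sketch): two tight pairs and one outlier.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.3, 5.0), (9.0, 0.0)]

merges = build_merges(points)                   # full hierarchy: n-1 merges
flat = flatten(len(points), merges, k=3)        # cut into 3 flat clusters
labelled = add_cluster_attribute(points, flat)  # examples with a cluster label

for point, label in labelled:
    print(point, "cluster", label)
```

Keeping the merge history separate from the cut means the same hierarchy can be flattened at several values of `k` without reclustering, which mirrors the reason given above for splitting the process into separate operators.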