"Rank order of attributes to each cluster"

dorona Member Posts: 1 Learner III
edited May 2019 in Help
Hi,
I have just started experimenting with clustering in RapidMiner and was able to get results. My problem now is how to characterize each cluster. Is there a way to get RapidMiner to output, for each cluster, a ranked list of the attributes that best describe it?
In addition, it would be great to get each attribute's actual contribution to the model, along with a statistic that measures its significance.

Thanks 
Answers

  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    Yes, this is possible with RapidMiner. After clustering, each example in the input data set is assigned a cluster id. You can then use the new operator "AttributeConstruction" (which will replace the operator FeatureGeneration in future releases) together with the new ValueIterator operator: for each cluster id, construct a label that separates that cluster from all others, and then weight the attributes against that label. The whole setup looks like this:

    <operator name="Root" class="Process" expanded="yes">
        <!-- Generate sample data and give every example an id -->
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="number_examples" value="200"/>
            <parameter key="number_of_attributes" value="10"/>
            <parameter key="target_function" value="gaussian mixture clusters"/>
        </operator>
        <operator name="IdTagging" class="IdTagging">
        </operator>
        <!-- Cluster the data; each example gets a cluster id -->
        <operator name="KMeans" class="KMeans">
            <parameter key="k" value="5"/>
        </operator>
        <!-- Drop the cluster model; only the example set is needed from here on -->
        <operator name="IOConsumer" class="IOConsumer">
            <parameter key="io_object" value="ClusterModel"/>
        </operator>
        <!-- Loop over the values of the cluster attribute -->
        <operator name="ValueIterator" class="ValueIterator" expanded="yes">
            <parameter key="attribute" value="cluster"/>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <!-- Build a label: current cluster versus "other" -->
                <operator name="AttributeConstruction" class="AttributeConstruction">
                    <list key="function_descriptions">
                      <parameter key="inner_label_%{loop_value}" value="if (cluster == &quot;%{loop_value}&quot;, &quot;%{loop_value}&quot;, &quot;other&quot;)"/>
                    </list>
                </operator>
                <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                    <parameter key="name" value="inner_label_%{loop_value}"/>
                    <parameter key="target_role" value="label"/>
                </operator>
                <!-- Weight the attributes against the constructed label -->
                <operator name="Relief" class="Relief">
                </operator>
                <!-- Drop the example set; the attribute weights remain -->
                <operator name="IOConsumer (2)" class="IOConsumer">
                    <parameter key="io_object" value="ExampleSet"/>
                </operator>
            </operator>
        </operator>
    </operator>
    Please note that you will need the latest CVS version of RapidMiner, or you will have to wait for the next release, to get access to both new operators. By the way, this is also possible with older versions, but the process is much more complicated.
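
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of the same scheme: relabel the examples as "this cluster" versus "other" for each cluster id, then weight the attributes against that label and sort. It is not the implementation of the Relief operator. The function names (`one_vs_rest_labels`, `relief_weights`, `rank_attributes_per_cluster`) and the toy data are hypothetical, and the weighting is a simplified Relief variant that visits every example once and uses a single nearest hit and miss with Manhattan distance.

```python
def one_vs_rest_labels(cluster_ids, target):
    """Keep the target cluster id; relabel every other cluster as 'other'."""
    return [c if c == target else "other" for c in cluster_ids]


def relief_weights(rows, labels):
    """Simplified Relief (an assumption, not RapidMiner's code): an attribute
    gains weight when it differs at the nearest example of the other class
    (miss) and loses weight when it differs at the nearest example of the
    same class (hit). Attributes are min-max scaled to [0, 1] first."""
    n_attrs = len(rows[0])
    lo = [min(r[j] for r in rows) for j in range(n_attrs)]
    hi = [max(r[j] for r in rows) for j in range(n_attrs)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(n_attrs)]
    scaled = [[(r[j] - lo[j]) / span[j] for j in range(n_attrs)] for r in rows]

    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    weights = [0.0] * n_attrs
    for i, row in enumerate(scaled):
        hits = [scaled[k] for k in range(len(scaled))
                if k != i and labels[k] == labels[i]]
        misses = [scaled[k] for k in range(len(scaled))
                  if labels[k] != labels[i]]
        if not hits or not misses:
            continue  # nothing to compare against for this example
        hit = min(hits, key=lambda r: dist(row, r))
        miss = min(misses, key=lambda r: dist(row, r))
        for j in range(n_attrs):
            delta = abs(row[j] - miss[j]) - abs(row[j] - hit[j])
            weights[j] += delta / len(scaled)
    return weights


def rank_attributes_per_cluster(rows, cluster_ids, names):
    """For each cluster, return (attribute, weight) pairs, best first."""
    ranking = {}
    for target in sorted(set(cluster_ids)):
        labels = one_vs_rest_labels(cluster_ids, target)
        weights = relief_weights(rows, labels)
        ranking[target] = sorted(zip(names, weights),
                                 key=lambda pair: pair[1], reverse=True)
    return ranking


# Toy data: attribute "a" separates the two clusters, "b" does not.
rows = [(0.0, 1.0), (0.1, 5.0), (0.2, 9.0),
        (10.0, 2.0), (10.1, 6.0), (9.9, 8.0)]
cluster_ids = ["c0", "c0", "c0", "c1", "c1", "c1"]
ranking = rank_attributes_per_cluster(rows, cluster_ids, ["a", "b"])
for cluster, ranked in ranking.items():
    print(cluster, ranked)
```

The weights give a ranking and a relative contribution per attribute, but they are not a significance test. For that part of the question you would still need a separate statistic computed on each attribute against the same one-vs-rest label.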

    Cheers,
    Ingo