How to extract distinct features of K-Means Cluster?

eldenoso · August 2017

Hello everyone,

I'm doing some cluster analysis and am mainly interested in the features of each cluster. How can each cluster be described by its attributes?

In a marketing case, for example, it's not enough to just cluster your customers. You also have to know how to treat each group, so you have to know what its main features are.

Is there a way to extract them from the K-Means algorithm or is there even a better approach to this?

Thanks in advance

kershov · August 2017

Hello!

I think the Extract Cluster Prototypes operator can help you.

eldenoso · August 2017

Thank you for your answer @kershov!

But I think that's not exactly what I was looking for, since the prototypes don't really describe the clusters. For example, when you plot the cluster you can see that the main group is in Germany, but the prototype says Norway, which seems contradictory.

Is there another way to get features extracted? In a decision tree for example it is easier to identify the important features.

Thank you

Telcontar120 · August 2017

Hi there, you have a couple of options for this common question.

You could turn your clusters into labels and then attempt to diagnose them using predictive modeling algorithms, using simple classifiers such as Naive Bayes or Decision Trees.
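Outside RapidMiner, the same idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the data, the attribute names (`age`, `visits_per_month`) and the helper `best_split` are all made up, and the "classifier" is just a depth-one decision tree (a single threshold split). A full decision tree would repeat this search recursively on each side of the split.

```python
from collections import Counter

# Hypothetical toy data: (age, visits_per_month, cluster id from a prior k-means run)
rows = [
    (22, 3, 0),
    (25, 9, 0),
    (28, 5, 0),
    (51, 4, 1),
    (55, 8, 1),
    (60, 6, 1),
]
feature_names = ["age", "visits_per_month"]


def gini(labels):
    """Gini impurity of a list of cluster labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, feature_names):
    """Return (weighted Gini, feature name, threshold) of the best single split."""
    best = None
    for i, name in enumerate(feature_names):
        values = sorted({r[i] for r in rows})
        # Candidate thresholds are the midpoints between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r[-1] for r in rows if r[i] <= threshold]
            right = [r[-1] for r in rows if r[i] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best


best = best_split(rows, feature_names)
print(best)
```

The attribute chosen for the split is the one that best tells the clusters apart, which is the kind of description the original question asks for.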

If you already have labels (not the clusters themselves) then you could use "Map Clustering on Labels" and do something similar. Or run a predictive model using only the cluster attribute against your existing labels.
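As a rough stand-in for that idea (not a reimplementation of the RapidMiner operator), the sketch below assigns each cluster its most frequent existing label and then checks how often that assignment is right. The data and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical data: one (cluster id, existing label) pair per example
pairs = [
    (0, "churn"), (0, "churn"), (0, "stay"),
    (1, "stay"), (1, "stay"), (1, "stay"), (1, "churn"), (1, "stay"),
]


def majority_label_per_cluster(pairs):
    """Map each cluster id to the label that occurs most often inside it."""
    counts = defaultdict(Counter)
    for cluster, label in pairs:
        counts[cluster][label] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in counts.items()}


def accuracy(pairs, mapping):
    """Share of examples whose label equals the majority label of their cluster."""
    hits = sum(1 for cluster, label in pairs if mapping[cluster] == label)
    return hits / len(pairs)


mapping = majority_label_per_cluster(pairs)
score = accuracy(pairs, mapping)
print(mapping, score)
```

A high score suggests the clusters line up with the existing labels, so the labels themselves can help describe each cluster.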

You can also use the centroid output from clusters to determine which attributes score highly for a given cluster but not for other clusters. You could even use "Generate Attributes" to define a new metric of the difference in centroid values between one cluster and another.
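One possible version of such a metric, sketched in plain Python: compute each cluster's centroid, then express it as a distance from the overall mean in units of the overall standard deviation. The choice of this standardized difference, the toy data, and the names `centroids` and `distinctive_features` are assumptions for illustration, not something RapidMiner prescribes.

```python
from statistics import mean, pstdev

# Hypothetical toy data: (age, yearly_spend, cluster id from a prior k-means run)
rows = [
    (25, 200.0, 0),
    (27, 220.0, 0),
    (26, 210.0, 0),
    (55, 900.0, 1),
    (58, 950.0, 1),
    (52, 850.0, 1),
]
feature_names = ["age", "yearly_spend"]


def centroids(rows):
    """Return {cluster id: [mean of each feature within that cluster]}."""
    by_cluster = {}
    for *features, cluster in rows:
        by_cluster.setdefault(cluster, []).append(features)
    return {c: [mean(col) for col in zip(*members)] for c, members in by_cluster.items()}


def distinctive_features(rows, feature_names):
    """Rank features per cluster by |centroid - overall mean| / overall std."""
    columns = list(zip(*[r[:-1] for r in rows]))
    overall_mean = [mean(col) for col in columns]
    overall_std = [pstdev(col) for col in columns]
    result = {}
    for c, centre in centroids(rows).items():
        scores = {
            name: (centre[i] - overall_mean[i]) / overall_std[i]
            for i, name in enumerate(feature_names)
            if overall_std[i] > 0  # skip constant features
        }
        result[c] = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return result


ranking = distinctive_features(rows, feature_names)
for cluster, scores in ranking.items():
    print(cluster, scores)
```

Attributes with a large positive or negative score for one cluster, and not for the others, are the ones that characterise it.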

You might also want to search through the forum on this topic since there are many existing threads that are related, and they might give you even more ideas. Here's one, for example: http://community.rapidminer.com/t5/RapidMiner-Studio-Forum/Cluster-Performance-DBScan-and-agglomerative-Clustering/m-p/40754#M27689

I hope this helps!


Altair RapidMiner Community
