X-Means always same behavior

nelsonthekinger · June 2014

Hello Experts!

I'm trying to use X-Means because of its advantages over K-Means, but I'm not getting the proper result.
I tried K-Means on 6 files from 3 categories, and with k = 3 it worked perfectly.
Then I applied X-Means with k from 2 to 60, and I always get 2 clusters.

I thought it could be because I had too few files, so I tried again with 53 files from 3 categories,
and the result was the same: K-Means (k = 3) was successful, X-Means (k = 2 to 60) gave the same 2 clusters.

I've tried many configurations, but the one I use most is:
measure type: NumericalMeasures
numerical measure: CosineSimilarity
clustering algorithm: KMeans
The rest is default.

I'm clueless about the reason. Any help is appreciated!

nelsonthekinger · July 2014

Anyone?

bigbangtwo · June 2015

Hello!
I have the same problem. I tried X-Means with kmin=2 and kmax=60. For my data the right result is 4 clusters, but X-Means ran and returned 2 clusters. I got the same result with the other data I tried.
Who can help me?)

Muhammed_Fatih_ · May 2020

Hello everyone,

I tried X-Means on the interval k-min=2 to k-max=60 as well as with k-min=20 and k-max=60. Each time, the X-Means model gives me the minimum k (k=2 in the first run and k=20 in the second). Is it normal that X-Means always picks the minimum k?

Best regards!

MartinLiebig · May 2020

Hi,

did you normalize the data before clustering?

Best,

Martin

mantanz · May 2020

If possible, please share your XML and let me know the number of examples in your data set.

The situation you describe can happen if you don't have many examples for clustering, or if they are simply too similar to one another, so that X-Means always resorts to the simplest clustering scheme.
In such a case it is better to normalize the data beforehand. This ensures all the attributes are on the same scale before the algorithm is applied.
For example, say attribute1 has a data range of 0-100 and attribute2 has a range of 0-1. In this case attribute1 gets more weight than attribute2. But if you apply Normalize, both attributes will be converted to a 0-1 scale.

RapidMiner operator to use: "Normalize"
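
To make the scale effect concrete, here is a minimal Python sketch (not RapidMiner code) of roughly what a 0-1 range normalization does. The data values and the helper names min_max_normalize and euclidean are made up for illustration, and plain Euclidean distance is used instead of the CosineSimilarity measure from the first post, only because it makes the imbalance easy to see.

```python
def min_max_normalize(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant attribute has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def euclidean(a, b):
    """Euclidean distance between two equally long numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical data: attribute1 ranges over 0-100, attribute2 over 0-1.
attribute1 = [0.0, 50.0, 100.0]
attribute2 = [0.0, 1.0, 0.5]

# Each row is one example: (attribute1, attribute2).
rows_raw = list(zip(attribute1, attribute2))
rows_norm = list(zip(min_max_normalize(attribute1),
                     min_max_normalize(attribute2)))

# Distance between the first two examples, before and after normalization.
dist_raw = euclidean(rows_raw[0], rows_raw[1])
dist_norm = euclidean(rows_norm[0], rows_norm[1])

print("raw distance:       ", dist_raw)
print("normalized distance:", dist_norm)
```

In the raw data the distance between the first two examples is almost entirely due to attribute1 (a difference of 50 against a difference of 1). After normalization both attributes lie in 0-1 and attribute2 contributes as well. Whether this alone changes the number of clusters X-Means picks will depend on your data.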
