
# What is the relation between cluster size and centroid table? Which model makes more sense? Why?

Member Posts: 8 Contributor I
Hello folks,

I am comparing two clustering results, shown below.

My question is: what is the relation between cluster size and the centroid table? Which model makes more sense, and why?
(Case 1):

(Case 2):


• RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
edited October 2019
Hi @NatalySimth,

Without any additional information, and just to get a general idea, you can compare the two models by calculating the average within centroid distance, which measures the "compacity" (compactness) of the clusters.
To do that, add a Performance (Cluster Distance Performance) operator at the end of your process.
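
To illustrate what this measure captures, here is a minimal Python sketch outside RapidMiner. The function names and the toy data are made up for illustration, and it assumes plain Euclidean distance; the distance measure and sign convention used by the Cluster Distance Performance operator may differ.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def avg_within_centroid_distance(clusters):
    """Mean Euclidean distance from each point to its own cluster's centroid.

    `clusters` is a list of clusters; each cluster is a list of points.
    Smaller values mean tighter (more compact) clusters.
    """
    total, count = 0.0, 0
    for points in clusters:
        c = centroid(points)
        total += sum(math.dist(p, c) for p in points)
        count += len(points)
    return total / count

# Hypothetical toy data: the same four points grouped in two different ways.
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(avg_within_centroid_distance(tight))
print(avg_within_centroid_distance(loose))
```

The first grouping keeps neighbouring points together, so its value is much smaller than that of the second grouping. In that sense it is the more compact model.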

Edit:
I wanted to correct and complete the explanation above.
Assuming that you are using the k-means algorithm, one way to find the best k (the number of clusters), and thus the best model, is to plot the average within centroid distance against k. You will get a curve like this (or its mirror image, since the average within centroid distance is reported as a negative value in RapidMiner):

The best k, and thus the most relevant model, corresponds to the bend (the "elbow") of the curve, where adding more clusters stops giving a large improvement.
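
The elbow idea can be sketched in plain Python, outside RapidMiner. Everything here is an assumption made for illustration: the toy data, the small k-means with deterministic farthest-point seeding, and the "largest second difference" rule used to pick the bend. The distances are kept positive, so the curve is the mirror image of what RapidMiner reports.

```python
import math

def mean(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point that is farthest from all seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds

def kmeans(points, k, max_iter=100):
    """Basic Lloyd iterations; an empty cluster keeps its previous centroid."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

def avg_within_distance(points, k):
    """Average Euclidean distance of every point to its cluster centroid."""
    centroids, clusters = kmeans(points, k)
    total = sum(math.dist(p, c) for c, pts in zip(centroids, clusters) for p in pts)
    return total / len(points)

def elbow(curve):
    """Heuristic: the k where the drop before it most exceeds the drop after it.
    Assumes `curve` maps consecutive integer values of k to distances."""
    ks = sorted(curve)
    return max(ks[1:-1], key=lambda k: curve[k - 1] - 2 * curve[k] + curve[k + 1])

# Hypothetical toy data: three well-separated groups of four points each.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5)]
data = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

curve = {k: avg_within_distance(data, k) for k in range(1, 6)}
for k in sorted(curve):
    print(k, round(curve[k], 3))
print("elbow at k =", elbow(curve))
```

On this toy data the distance falls steeply up to k = 3 and barely changes afterwards, so the heuristic picks k = 3. On real data the bend is often less sharp, and reading it off the plot is a judgment call.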

Hope this helps,

Regards,

Lionel
• Member Posts: 8 Contributor I
Hey lionelderkrikor, thanks for your explanation. If I may ask, what do you mean by the "compacity" of the clusters?

How can I create the performance measure and the elbow plot? I am still new to all of these methods.
• Member Posts: 8 Contributor I
@lionelderkrikor Thanks a million! Such useful information.
• RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
@NatalySimth,

You're welcome!

Regards,

Lionel
• Member Posts: 93 Maven
Hi @lionelderkrikor

Thank you for your inspiring answer above! By the same logic, it should also be possible to generate the elbow plot using the Davies-Bouldin index as the main criterion for the comparison, right?
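
For reference, the Davies-Bouldin index can be sketched in a few lines of Python. The function names and toy data below are made up for illustration, Euclidean distance is assumed, and the way RapidMiner computes or signs this value may differ. Lower values of the index indicate compact, well-separated clusters, so it is usually compared by looking for the smallest value rather than for a bend.

```python
import math

def centroid(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    For each cluster, take the worst ratio of combined scatter to centroid
    separation against any other cluster, then average those worst ratios.
    Needs at least two clusters with distinct centroids.
    """
    cents = [centroid(pts) for pts in clusters]
    # Scatter: mean distance of a cluster's points to its own centroid.
    scatter = [sum(math.dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(clusters, cents)]
    k = len(clusters)
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

# Hypothetical toy data: the same four points grouped in two different ways.
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(davies_bouldin(tight))
print(davies_bouldin(loose))
```

The grouping that keeps neighbouring points together gets the much smaller index.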