# What is the relation between cluster size and centroid table? Which model makes more sense? Why?

Member
I am comparing two results, which I have as below:

What is the relation between cluster size and centroid table? Which model makes more sense? Why?
Member
edited October 2019
Hi @NatalySimth,

Without any additional information, and just to get a general idea, you can calculate the average within centroid distance, which measures the "compacity" (compactness) of the clusters, and use it to compare the two models.
To do that, add a Performance (Cluster Distance Performance) operator at the end of your process.
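
To give an idea of what that measure captures, here is a minimal sketch in plain Python (not RapidMiner code). The toy points, the two label lists and the function names are hypothetical, and using the squared Euclidean distance is an assumption on my side; the operator's exact formula may differ.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def avg_within_centroid_distance(points, labels):
    """Average squared distance from each point to the centroid of its cluster.

    Lower values mean more compact clusters. The squared distance is an
    assumption made for this sketch.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        c = centroid(members)
        total += sum(sq_dist(p, c) for p in members)
    return total / len(points)


# Hypothetical toy data: two visually obvious groups, far apart on the x axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
model_a = [0, 0, 1, 1]  # puts neighbouring points in the same cluster
model_b = [0, 1, 0, 1]  # puts points from opposite sides in the same cluster

score_a = avg_within_centroid_distance(points, model_a)
score_b = avg_within_centroid_distance(points, model_b)
print("model A:", score_a)
print("model B:", score_b)
```

In this toy case model A gets the smaller value, so its clusters are the more compact ones. Note that this only compares compactness; it says nothing about whether the number of clusters is a good choice.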

Edit:
I wanted to correct and complete the explanation above:
Assuming that you are using the K-means algorithm, one way to find the best k (the number of clusters), and thus the best model, is to plot the "Average within centroid distance" against k. You will obtain a curve like this one (or its mirror image, since the average within centroid distance values are reported as negative numbers in RapidMiner):

The best k, and thus the most relevant model, corresponds to the inflection point (the "elbow") of the curve.
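
If it helps to see the idea outside RapidMiner, here is a rough sketch in plain Python. Everything in it is an assumption made for illustration: the toy data, the deterministic farthest-point seeding, the squared Euclidean distance, and the rule of picking the k with the largest second difference as a stand-in for reading the elbow off the plot. The values here are positive, unlike the negative ones RapidMiner reports.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def init_centroids(points, k):
    """Deterministic farthest-point seeding (an assumption, chosen for reproducibility)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to the nearest centroid, then recompute means."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments are stable, so we have converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = mean(members)
    return centroids, labels


def avg_within_centroid_distance(points, centroids, labels):
    """Average squared distance from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels)) / len(points)


# Hypothetical toy data: three well-separated groups of four points each.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
offsets = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# One value per k: this is the curve you would plot.
ks = list(range(1, 7))
curve = {}
for k in ks:
    centroids, labels = kmeans(points, k)
    curve[k] = avg_within_centroid_distance(points, centroids, labels)
    print(k, round(curve[k], 3))

# Crude elbow pick: the k where the curve bends the most (largest second difference).
elbow = max(ks[1:-1], key=lambda k: curve[k - 1] - 2 * curve[k] + curve[k + 1])
print("elbow at k =", elbow)
```

On this toy data the bend is at k = 3, which matches the three groups the points were built from. On real data the bend is usually less sharp, so it is worth looking at the plot rather than trusting the automatic pick.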

Lionel
Member
Hey lionelderkrikor, thanks for your explanation. If you don't mind my asking, what do you mean by the "compacity" of the clusters?

How can I create the performance measure and the elbow plot? I am still new to all of these methods.
Member
@lionelderkrikor Thanks a million! Such useful information.
Member
@NatalySimth,

Lionel
Member
Hi @lionelderkrikor

Thank you for your inspiring answer above! In that sense, it should also be possible to generate the elbow using the Davies-Bouldin index as the main criterion for the comparison, right?