What is the relation between cluster size and centroid table? Which model makes more sense? Why?

NatalySimth Member Posts: 8 Contributor II
Hello folks,

I am comparing two clustering results, shown below.

My question is: what is the relation between cluster size and the centroid table? Which model makes more sense, and why?
(Case 1):



(Case 2):


Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 952 Unicorn
    edited October 2019
    Hi @NatalySimth,

    Without any additional information, to get a general idea you can calculate the average within-centroid distance, which measures the "compactness" of the clusters, and use it to compare the two models.
    For that, you have to put a Performance (Cluster Distance Performance) operator at the end of your process.
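
    To make the metric concrete, here is a minimal Python sketch of an average within-centroid distance computed outside RapidMiner. It is not the operator's implementation: the function name, the toy data, and the use of squared Euclidean distance are assumptions made for illustration. The overall value is an average over all examples, so larger clusters weigh more in it, which is one way cluster size and the centroid table are connected.

```python
from collections import defaultdict


def avg_within_centroid_distance(points, labels, centroids):
    """Mean squared Euclidean distance from each point to its assigned centroid.

    Assumption: squared Euclidean distance; a given tool may define it differently.
    Returns (overall_average, per_cluster_average).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for point, label in zip(points, labels):
        center = centroids[label]
        sums[label] += sum((p - c) ** 2 for p, c in zip(point, center))
        counts[label] += 1
    per_cluster = {k: sums[k] / counts[k] for k in counts}
    # Averaging over all points weights each cluster by its size.
    overall = sum(sums.values()) / sum(counts.values())
    return overall, per_cluster


# Hypothetical toy data: two clusters with their centroids (the "centroid table").
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]
centroids = {0: (1.0, 0.0), 1: (10.0, 2.0)}

overall, per_cluster = avg_within_centroid_distance(points, labels, centroids)
print(overall, per_cluster)
```

    A smaller value means tighter clusters. When comparing two models, keep in mind that the value also tends to shrink as the number of clusters grows.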

    Edit:
    I wanted to correct and complete the explanation above:
    Assuming that you are using the K-means algorithm, one method to find the best k (number of clusters), and thus the best model, is to plot the "Average within centroid distance" against k. You will obtain a curve like the one below (or its mirror image, since the average within centroid distance is reported as a negative value in RapidMiner):



    The best k, and thus the most relevant model, corresponds to the inflection point (the "elbow") of the curve.
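
    The procedure can be sketched in plain Python as follows. Everything here is an illustrative assumption rather than what RapidMiner does internally: the tiny k-means with deterministic farthest-first seeding, the toy data made of three well-separated groups, and the rule for locating the elbow (the k with the largest second difference of the curve, one common heuristic among several).

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_avg_distance(points, k, iterations=50):
    """Run basic Lloyd's k-means and return the average squared distance
    from each point to its nearest (assigned) centroid."""
    centroids = farthest_first(points, k)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # converged
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)


def elbow_k(curve):
    """Return the k with the largest second difference (sharpest bend).
    Assumes the keys of `curve` are consecutive integers."""
    inner = sorted(curve)[1:-1]
    return max(inner, key=lambda k: curve[k - 1] - 2 * curve[k] + curve[k + 1])


# Hypothetical toy data: three groups of four points each.
corners = [(0, 0), (10, 10), (20, 0)]
points = [(cx + dx, cy + dy) for cx, cy in corners for dx in (0, 1) for dy in (0, 1)]

curve = {k: kmeans_avg_distance(points, k) for k in range(1, 7)}
best_k = elbow_k(curve)
for k in sorted(curve):
    print(k, round(curve[k], 3))
print("elbow at k =", best_k)
```

    On real data the bend is often less sharp than in this toy example, so it is worth looking at the plotted curve as well rather than relying on the heuristic alone.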

    Hope this helps,


    Regards,

    Lionel 
  • NatalySimth Member Posts: 8 Contributor II
    Hey lionelderkrikor, thanks for your explanation. If you allow me, what do you mean by the "compactness" of the clusters?

    How can I create the Performance operator and the elbow plot? I am still new to all of these methods.
  • NatalySimth Member Posts: 8 Contributor II
    @lionelderkrikor Thanks a million! :) Very useful information.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 952 Unicorn
    @NatalySimth,

    You're welcome!

    Regards,

    Lionel
  • Muhammed_Fatih_ Member Posts: 42 Maven
    Hi @lionelderkrikor

    Thank you for your inspiring answer above! In that case, it should also be possible to generate the elbow using the Davies-Bouldin index as the main comparison criterion, right?

    Thank you in advance for your answer! 

    Regards! 