Measuring Cluster Validity by purity measures?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help


I have a problem with 3 classes and want to do k-means clustering. Is there some way to assess cluster performance by a cluster purity criterion such as homogeneity, entropy, or information gain?

Is there an operator that does that?


Best Answer

    Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    Hi @Fred12, this seems quite similar to the question you posted on Thursday. Is it different in some way? I posted a reply there but didn't see a response, so you might want to check it out: my reply to Thursday's post.


    In any event, the operator "Item Distribution Performance" lets you look at the overall Gini coefficient based on your label across your clusters, which seems similar to what you are asking to measure. 
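
    For anyone who wants to check such measures outside RapidMiner, here is a minimal sketch in plain Python of three label-based purity criteria: purity, size-weighted entropy, and size-weighted Gini impurity. The function names and the toy data are assumptions made up for illustration, and the Gini impurity computed here is the textbook definition, which is not necessarily the exact quantity the RapidMiner operator reports.

```python
from collections import Counter, defaultdict
from math import log2


def cluster_label_counts(labels, clusters):
    """Count how often each true label occurs inside each cluster."""
    counts = defaultdict(Counter)
    for label, cluster in zip(labels, clusters):
        counts[cluster][label] += 1
    return counts


def purity(labels, clusters):
    """Fraction of points that carry the majority label of their cluster."""
    counts = cluster_label_counts(labels, clusters)
    return sum(max(c.values()) for c in counts.values()) / len(labels)


def weighted_entropy(labels, clusters):
    """Label entropy per cluster (in bits), averaged by cluster size."""
    counts = cluster_label_counts(labels, clusters)
    n = len(labels)
    total = 0.0
    for c in counts.values():
        size = sum(c.values())
        h = -sum((k / size) * log2(k / size) for k in c.values())
        total += (size / n) * h
    return total


def weighted_gini(labels, clusters):
    """Gini impurity of the labels per cluster, averaged by cluster size."""
    counts = cluster_label_counts(labels, clusters)
    n = len(labels)
    total = 0.0
    for c in counts.values():
        size = sum(c.values())
        g = 1.0 - sum((k / size) ** 2 for k in c.values())
        total += (size / n) * g
    return total


# Hypothetical toy data: 3 classes, 3 clusters (e.g. from k-means with k=3).
labels = ["a", "a", "a", "b", "b", "c", "c", "c"]
clusters = [0, 0, 0, 0, 1, 1, 2, 2]

print("purity:", purity(labels, clusters))
print("weighted entropy:", weighted_entropy(labels, clusters))
print("weighted Gini:", weighted_gini(labels, clusters))
```

    Higher purity is better, while lower entropy and lower Gini impurity are better; a clustering that matches the classes exactly gives purity 1 and entropy and Gini 0.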

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts


    Fred12 Member Posts: 344 Unicorn

    Yes, sorry, I couldn't find my post afterwards ;)
