Validation of k-means Clustering

tiramisusann Member Posts: 9 Contributor II
edited June 2019 in Help
Hi everybody,

I need to validate my k-means clustering by internal measures, but I am not quite sure how to do that.

First, I want to compute the Davies-Bouldin index for different values of k and compare the results to choose the best k. But why is the DB index negative? Does it mean that a DB of "-5" is better than a DB of "-1", since I have to choose the smallest DB index for an optimal clustering?

What other possibilities do I have in RM to check validity? I read about the sum of squares, which I can obtain through "Item Distribution Performance". But I am not sure whether I am getting the total, the between-cluster, or the within-cluster sum of squares.

Does anybody know? I would really appreciate your help and your ideas.

Best,
tiramisusann
Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello tiramisusann

    This link might help...

    rapidminernotes.blogspot.com/search/label/Clustering

    The values are negative because some operators work by trying to maximise performance. A negative value that tends to 0 fits this requirement, although in reality the absolute value is the one to use. So a DB of "-1" (absolute value 1) indicates a better clustering than a DB of "-5".

    regards,

    Andrew
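
To make the point about the sign concrete, here is a minimal sketch of the Davies-Bouldin index in plain Python. It is not RapidMiner's implementation: the function names and the toy data are made up for illustration, it assumes Euclidean distance, and it assumes the tool reports the negated index, as described in the answer above.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of non-empty clusters (lower is better)."""
    centers = [centroid(c) for c in clusters]
    # Scatter of a cluster: mean distance of its points to its centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k

# Hypothetical toy data: the same eight points, grouped well and badly.
good = [[(0, 0), (0, 1), (1, 0), (1, 1)],
        [(10, 10), (10, 11), (11, 10), (11, 11)]]
bad = [[(0, 0), (0, 1), (10, 10), (10, 11)],
       [(1, 0), (1, 1), (11, 10), (11, 11)]]

db = {"good split": davies_bouldin(good), "bad split": davies_bouldin(bad)}

# Assumed convention: the tool shows the negated index so that "larger is better".
reported = {name: -value for name, value in db.items()}

best_by_max = max(reported, key=reported.get)                   # closest to 0
best_by_abs = min(reported, key=lambda n: abs(reported[n]))     # smallest |DB|

for name in db:
    print(f"{name}: DB = {db[name]:.3f}, reported = {reported[name]:.3f}")
print("best clustering:", best_by_max)
```

With this toy data the well-separated grouping gets a DB of roughly 0.1 and the poor grouping a DB of roughly 14, so the negated values are about -0.1 and -14. Picking the reported value closest to 0 and picking the smallest absolute value select the same clustering, which is why "-1" beats "-5". When comparing several values of k, the same rule applies to the reported value for each k.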