Validation of k-means Clustering

tiramisusann Member Posts: 9 Contributor II
edited June 2019 in Help
Hi everybody,

I need to validate my k-means clustering by internal measures, but I am not quite sure how to do that.

First, I want to compute the Davies-Bouldin index for different values of k and compare the results to choose the best k. But why is the DB index negative? Does that mean that a DB of "-5" is better than a DB of "-1", since I have to choose the smallest DB index for an optimal clustering?

What other options do I have in RM to check validity? I read about the sum of squares, which I can obtain through "Item Distribution Performance". But I am not sure whether I am getting the total, the between-cluster, or the within-cluster sum of squares.

Does anybody know? I would really appreciate your help and your ideas.

Best,
tiramisusann
Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello tiramisusann

    This link might help...

    rapidminernotes.blogspot.com/search/label/Clustering

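    The values are negative because some operators work by trying to maximise performance. The Davies-Bouldin index is better when it is smaller, so it is reported with its sign flipped: a negative value that tends towards 0 fits the maximisation requirement. In practice, the absolute value is the one to use, so "-1" indicates a better clustering than "-5".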

    regards,

    Andrew
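
To make the sign convention concrete, here is a minimal sketch in plain Python, outside RapidMiner. It computes the standard Davies-Bouldin index by hand and then negates it, on the assumption (taken from the explanation above) that the reported value is simply the negated index. The function names and the two toy data sets are made up for illustration and are not part of any RapidMiner API.

```python
import math


def centroid(points):
    """Mean point of a list of equal-length coordinate tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def davies_bouldin(clusters):
    """Standard Davies-Bouldin index; smaller means better separation.

    clusters: list of clusters, each a list of coordinate tuples.
    """
    centroids = [centroid(c) for c in clusters]
    # Scatter of a cluster: average distance of its points to its centroid.
    scatter = [
        sum(math.dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical toy data: two tight, well separated clusters ...
tight = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
# ... versus two spread-out clusters that sit closer together.
loose = [[(0, 0), (0, 4)], [(5, 0), (5, 4)]]

db_tight = davies_bouldin(tight)
db_loose = davies_bouldin(loose)

# Assumed reporting convention: negate the index so "larger is better".
reported_tight = -db_tight
reported_loose = -db_loose

print(reported_tight, reported_loose)
```

Under that assumption, the tight clustering gets the reported value closer to 0, so picking the largest reported value and picking the smallest absolute value lead to the same choice of k.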