Validation of k-means Clustering

tiramisusann Member Posts: 9 Contributor II
edited June 2019 in Help
Hi everybody,

I need to validate my k-means clustering by internal measures, but I am not quite sure how to do that.

First, I want to compute the Davies-Bouldin index for different values of k and compare the results to choose the best k. But why is the DB index negative? Does it mean that a DB of "-5" is better than a DB of "-1", since I have to choose the smallest DB index for an optimal clustering?

What other options do I have in RM to check validity? I read about the sum of squares, which I can obtain through "Item Distribution Performance". But I am not sure whether I am getting the total, the between-cluster, or the within-cluster sum of squares.

Does anybody know? I would really appreciate your help and your ideas.

Best,
tiramisusann
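
For reference, the three sums of squares mentioned in the question are related by total = within + between. The sketch below is plain Python, not RapidMiner output; the toy points, the labels, and the helper names (`sq_dist`, `mean_point`, `sums_of_squares`) are made up for illustration. It does not say which quantity a given RapidMiner operator reports; it only shows how the three differ.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sums_of_squares(points, labels):
    """Return (total, within, between) sums of squares for a labelled data set."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    overall = mean_point(points)
    # Total: squared distance of every point to the overall mean.
    total = sum(sq_dist(p, overall) for p in points)
    # Within: squared distance of every point to its own cluster centroid.
    within = sum(
        sq_dist(p, mean_point(members))
        for members in clusters.values()
        for p in members
    )
    # Between: squared distance of each centroid to the overall mean,
    # weighted by the cluster size.
    between = sum(
        len(members) * sq_dist(mean_point(members), overall)
        for members in clusters.values()
    )
    return total, within, between


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

total, within, between = sums_of_squares(points, labels)
print("total:", total, "within:", within, "between:", between)
```

For a fixed data set the total does not change with k, so only the within and between parts are useful for comparing clusterings: a lower within value (equivalently, a higher between value) means tighter clusters.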

Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello tiramisusann

    This link might help...

    rapidminernotes.blogspot.com/search/label/Clustering

    The values are negative because some operators work by trying to maximise a performance measure. A negated value that tends towards 0 fits this requirement, although in reality the absolute value is the one to use. So a DB of "-1" is better than a DB of "-5".

    regards,

    Andrew
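
The sign convention described in this answer can be checked with a small sketch. It is plain Python rather than a RapidMiner process, and it assumes the common definition of the Davies-Bouldin index (mean distance to the centroid as cluster scatter, Euclidean distance between centroids). The toy points, the two labelings, and the function names are hypothetical. The function needs at least two clusters with distinct centroids.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(points, labels):
    """Davies-Bouldin index (non-negative; smaller means better separation)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    ids = sorted(clusters)
    cents = {c: centroid(clusters[c]) for c in ids}
    # Scatter: mean distance of the cluster's points to its centroid.
    scatter = {
        c: sum(math.dist(p, cents[c]) for p in clusters[c]) / len(clusters[c])
        for c in ids
    }
    # For each cluster take the worst (largest) similarity ratio to any other.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in ids
            if j != i
        )
        for i in ids
    ]
    return sum(worst) / len(ids)


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels_k2 = [0, 0, 0, 0, 1, 1, 1, 1]  # matches the two natural groups
labels_k4 = [0, 0, 2, 2, 1, 1, 3, 3]  # splits each group in half

db = {2: davies_bouldin(points, labels_k2), 4: davies_bouldin(points, labels_k4)}
negated = {k: -v for k, v in db.items()}  # the form a maximising operator reports

best_by_min_abs = min(db, key=db.get)                # smallest absolute value
best_by_max_negated = max(negated, key=negated.get)  # negated value closest to 0

print(db, negated, best_by_min_abs, best_by_max_negated)
```

Both selection rules pick the same k: maximising the negated value is the same as minimising the absolute value, which is why the reported value closest to 0 marks the better clustering.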