# How to evaluate clustering

Hello

I want to compare clusterings and evaluate them. Which operators should I use?

Also, how do I find the optimal parameters for each clustering method?

Thanks

## Answers

RM Research

Hi,

finding optimal settings for clustering is indeed a bit tricky.

But RapidMiner offers performance measures for clustering or segmentation tasks. In the Operator list under Validation -> Segmentation you'll find the corresponding operators.

If you have a subset of your data where you know exactly which cluster each example belongs to, you can also set the cluster attribute as the prediction and optimize the classification performance instead.
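The labeled-subset idea can be illustrated outside RapidMiner with a minimal Python sketch. It is not a RapidMiner operator: the function name `majority_label_accuracy` and the toy data are hypothetical, and the approach shown (map each cluster to its most frequent known label, then score accuracy) is just one simple way to treat cluster assignments as predictions.

```python
from collections import Counter, defaultdict


def majority_label_accuracy(cluster_ids, true_labels):
    """Map each cluster to its most frequent known label, then score accuracy."""
    if not cluster_ids or len(cluster_ids) != len(true_labels):
        raise ValueError("inputs must be non-empty and of equal length")

    # Count how often each known label occurs inside each cluster.
    labels_per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        labels_per_cluster[cluster][label] += 1

    # Each cluster "predicts" its majority label.
    mapping = {
        cluster: counts.most_common(1)[0][0]
        for cluster, counts in labels_per_cluster.items()
    }
    correct = sum(mapping[c] == y for c, y in zip(cluster_ids, true_labels))
    return correct / len(true_labels), mapping


# Hypothetical labeled subset: cluster assignments and the known classes.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
known = ["a", "a", "b", "b", "b", "b", "c", "c"]
accuracy, mapping = majority_label_accuracy(clusters, known)
print(accuracy, mapping)
```

Running the same scoring for several parameter settings and keeping the setting with the highest accuracy is one way to turn this into a parameter search, assuming the labeled subset is representative of the full data.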

Best,

David

Contributor I

Hello

What do these measures mean?

avg within centroid distance: -1.0876

davies bouldin: -5.675

Contributor I

I used Silhouette.

What do these results show?

Please guide

Thanks

RM Research

Hi again,

I guess the Silhouette performance comes from a third-party extension, so I can't say much about it. But Wikipedia has an entry about it:

https://en.wikipedia.org/wiki/Silhouette_(clustering)

In short, it measures how similar an example is to the rest of its own cluster compared to the other clusters. The value is normalized between -1 and +1, and a higher value indicates greater similarity.

The Davies–Bouldin criterion is also explained quite well on Wikipedia:
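As a rough illustration of that definition, here is a minimal Python sketch of the silhouette value per example. It assumes Euclidean distance and follows the textbook formula (b - a) / max(a, b); it is not the code of the extension mentioned above, and the function name `silhouette_values` and the toy points are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Two well-separated pairs of points (made-up data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_values(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_values(points, [0, 1, 0, 1])   # mixes the two pairs
print(sum(good) / len(good), sum(bad) / len(bad))
```

The sensible grouping yields values close to +1, while the mixed grouping yields negative values, which matches the reading that higher is better.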

https://en.wikipedia.org/wiki/Davies%E2%80%93Bouldin_index

The idea is to maximize the inter-cluster distance (the difference between the clusters) and minimize the intra-cluster distances (the points within each cluster should be close together). Here a lower index is better.
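To make the trade-off concrete, here is a minimal Python sketch of the Davies–Bouldin index in its conventional form: for each cluster, take the worst ratio of summed within-cluster scatter to the distance between centroids, then average over clusters. It assumes Euclidean distance and distinct centroids; the function name `davies_bouldin` and the toy points are hypothetical, and this is not RapidMiner's implementation.

```python
import math


def davies_bouldin(points, labels):
    """Conventional Davies-Bouldin index (non-negative, lower is better)."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    if len(groups) < 2:
        raise ValueError("the index needs at least two clusters")

    # Centroid of each cluster: coordinate-wise mean of its members.
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in groups.items()
    }

    total = 0.0
    for i in groups:
        # Worst-case similarity of cluster i to any other cluster j.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in groups
            if j != i
        )
    return total / len(groups)


# Two well-separated pairs of points (made-up data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
db_good = davies_bouldin(points, [0, 0, 1, 1])  # compact, far-apart clusters
db_bad = davies_bouldin(points, [0, 1, 0, 1])   # spread-out, overlapping clusters
print(db_good, db_bad)
```

The compact, well-separated grouping gets a much smaller index than the mixed one, in line with "lower is better".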

Best,

David

Contributor I

Hello

Many thanks

What does this criterion mean?

AVG within centroid distance: -1.043

What does the Silhouette of each cluster show in the first photo?