# Problems with Auto Model Cluster Analysis

"I am using Auto Model to do a k-means cluster analysis. It works fine for 2
clusters. With 3 or more clusters, one or more clusters have an average
distance of ? and a Davies-Bouldin index of infinity. This appeared
before and I thought version 9.6 had fixed it, but apparently not. It
also appears in the beta of 9.7. Is there a way around this? Thanks."

## Answers

Can you share your data so that we can reproduce the problem and understand what's going on?

Regards,

Lionel

Thank you for sharing your data.

I can reproduce what you observe.

But something strange seems to be happening in Auto Model itself: if I use your data (only the first four variables) with a k-Means model (k = 3, 4, etc.) in a classic RapidMiner process, the results are correct (i.e. I obtain finite values for the DB index and the average distances).

Does anyone have an idea of what's going on in Auto Model (clustering)?

The attached file contains the classic (working) RapidMiner process.
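For reference, here is a minimal Python sketch of how the Davies-Bouldin index is computed and of one way it can become infinite. It is not RapidMiner's implementation and not a diagnosis of Auto Model: the function names (`avg_distance`, `davies_bouldin`) and the toy data are made up for illustration, and treating a zero separation between two centroids as an infinite ratio is a convention chosen for this sketch.

```python
import math

def avg_distance(points, center):
    """Average Euclidean distance from the points of one cluster to its centroid."""
    if not points:
        # The average is undefined for an empty cluster (a missing value).
        return math.nan
    return sum(math.dist(p, center) for p in points) / len(points)

def davies_bouldin(clusters, centroids):
    """Davies-Bouldin index: mean over clusters of the worst
    (scatter_i + scatter_j) / separation_ij ratio. Lower is better."""
    k = len(clusters)
    scatter = [avg_distance(c, m) for c, m in zip(clusters, centroids)]
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if i == j:
                continue
            separation = math.dist(centroids[i], centroids[j])
            # Assumption of this sketch: coinciding centroids give an infinite ratio.
            ratio = math.inf if separation == 0 else (scatter[i] + scatter[j]) / separation
            worst = max(worst, ratio)
        total += worst
    return total / k

# Three well-separated toy clusters: the index is finite.
good_clusters = [[(0, 0), (0, 2)], [(10, 0), (10, 2)], [(0, 10), (0, 12)]]
good_centroids = [(0, 1), (10, 1), (0, 11)]
print(davies_bouldin(good_clusters, good_centroids))

# Two clusters share the same centroid: the index becomes infinite.
bad_clusters = [[(0, 0), (0, 2)], [(0, 0), (0, 2)], [(0, 10), (0, 12)]]
bad_centroids = [(0, 1), (0, 1), (0, 11)]
print(davies_bouldin(bad_clusters, bad_centroids))

# An empty cluster has no defined average distance.
print(avg_distance([], (0, 0)))
```

If Auto Model ends up with an empty cluster or with duplicated centroids for k = 3 or more, values like these would be one plausible explanation, but that remains to be checked.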

Regards,

Lionel

The "real" distances are, of course, positive.

It seems to me that RapidMiner multiplies the distances by minus one (-1), and therefore works with negative values, because RapidMiner's algorithms try to MAXIMIZE these values (explanation to be confirmed by the RM staff, @sgenzer?).
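To illustrate the idea, here is a small Python sketch, assuming the explanation above is right: a quantity that should be small (an average distance) is negated so that a "larger is better" selection picks the tightest clustering. The function names and the toy data are hypothetical and are not taken from RapidMiner.

```python
import math

def avg_within_centroid_distance(clusters, centroids):
    """Average Euclidean distance of every point to its own centroid (always >= 0)."""
    total, count = 0.0, 0
    for points, center in zip(clusters, centroids):
        for p in points:
            total += math.dist(p, center)
            count += 1
    return total / count

def score_to_maximize(clusters, centroids):
    """Negated distance: the smaller the real distance, the larger the score."""
    return -avg_within_centroid_distance(clusters, centroids)

# Two candidate clusterings of the same four points.
candidates = {
    "tight": ([[(0, 0), (0, 2)], [(10, 0), (10, 2)]], [(0, 1), (10, 1)]),
    "loose": ([[(0, 0), (10, 0)], [(0, 2), (10, 2)]], [(5, 0), (5, 2)]),
}

for name, (clusters, centroids) in candidates.items():
    print(name, score_to_maximize(clusters, centroids))

# Maximizing the negated value selects the clustering with the smallest distance.
best = max(candidates, key=lambda name: score_to_maximize(*candidates[name]))
print(best)
```

So a reported value of, say, -1 would correspond to a real average distance of 1.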

Regards,

Lionel
