I want to determine, or rather approximate, the optimal k for k-means, so I tried to generate an elbow plot. The generated plot looks like the following, which confused me a bit:
It is not easy to pick an optimal k, because the plot shows more than one elbow and does not look like a conventional elbow plot. Where should I set k? At k=6, k=7, or k=14, each of which shows a marked drop in the avg. centroid distance measure? Or would it be better to generate the elbow plot from the Davies-Bouldin index instead of the centroid distance?
The data is represented by TF-IDF values.
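To make the comparison concrete, here is a minimal Python sketch of how I understand the two measures. It is not RapidMiner, and I have not verified that RapidMiner's performance operators compute them in exactly this way. Everything in it is an assumption for illustration: the helper names (`kmeans`, `avg_centroid_distance`, `davies_bouldin`, `scan_k`), the plain Lloyd's k-means, the Euclidean distance (cosine distance may suit TF-IDF vectors better), and the made-up 2-D toy points that stand in for my real data.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means (illustrative only); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old centroid if a cluster happens to run empty.
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

def avg_centroid_distance(points, centroids, labels):
    """Mean distance of each point to its own centroid (lower = tighter)."""
    return sum(dist(p, centroids[l]) for p, l in zip(points, labels)) / len(points)

def davies_bouldin(points, centroids, labels):
    """Davies-Bouldin index (lower = better separated); needs k >= 2."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        s = sum(dist(p, centroids[c]) for p in members) / len(members) if members else 0.0
        scatter.append(s)
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

def scan_k(points, ks, seed=0):
    """Map each k to (avg. centroid distance, Davies-Bouldin index)."""
    result = {}
    for k in ks:
        centroids, labels = kmeans(points, k, seed=seed)
        result[k] = (avg_centroid_distance(points, centroids, labels),
                     davies_bouldin(points, centroids, labels))
    return result

# Toy data: three made-up 2-D blobs, NOT my TF-IDF vectors.
rng = random.Random(1)
toy = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
       for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(20)]

for k, (acd, db) in scan_k(toy, range(2, 7)).items():
    print(f"k={k}  avg_centroid_dist={acd:.3f}  davies_bouldin={db:.3f}")
```

As far as I understand it, the avg. centroid distance tends to keep shrinking as k grows, so one has to look for a bend, whereas the Davies-Bouldin index also accounts for the separation between clusters and can simply be minimized over k. A single k-means run per k can also end in a local optimum, which might be one reason for several "elbows".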
Thanks in advance for your hints!