"Clustering Performance"

devl82 Member Posts: 4 Contributor I
edited June 2019 in Help

I'm trying to perform image segmentation using RapidMiner's clustering algorithms. Except for K-means, which completes in approximately 3-4 minutes, the other methods (EM, k-medoids, Kernel k-means) never seem to converge (although on a Q6600 with 2GB, RapidMiner never uses more than 30% of my CPU).

My data are simple per-pixel features such as texture, magnitude, gradient, etc., all normalized to 0-1 (for each 300x400 image, a 300x400x3 feature matrix is extracted).

Do I need a more powerful CPU or more memory, or some kind of different normalization/preprocessing specifically for these algorithms?

Thank you & sorry for the long msg (O>o)


    land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    I don't think the problem is your computer. Compared to K-Means, all other flat clustering methods take roughly a factor equal to your number of examples longer to converge. That's because K-Means exploits some neat properties of the Euclidean distance measure to run faster.
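To illustrate where that factor comes from, here is a minimal sketch in plain Python. It is not RapidMiner's implementation, and the function names `centroid` and `medoid` are made up for this example. With Euclidean distance, a K-Means cluster center is simply the mean of its members (one pass over the cluster), while a naive k-medoids update has to compare every member against every other member.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """K-Means style update: the mean of the members, one pass (O(m))."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))

def medoid(points):
    """Naive k-medoids style update: the member with the smallest total
    distance to all members. Needs m * m distance evaluations (O(m^2)).
    Returns the medoid and the number of distance evaluations used."""
    best, best_cost, evaluations = None, math.inf, 0
    for candidate in points:
        cost = 0.0
        for other in points:
            cost += euclidean(candidate, other)
            evaluations += 1
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, evaluations

# Toy cluster of five 2-feature examples (values chosen only for illustration).
cluster = [(0.0, 0.0), (0.125, 0.0), (0.25, 0.0), (0.5, 0.0), (1.0, 0.0)]
print(centroid(cluster))
print(medoid(cluster))

# One 300x400 image gives 120,000 examples. In the worst case (all examples
# in one cluster) a naive medoid update touches every pair of examples.
examples = 300 * 400
pairwise_distances = examples ** 2
print(examples, pairwise_distances)
```

For the image size in the question, the pairwise count is 14.4 billion distance evaluations per update in the worst case, against 120,000 additions for the mean. Real implementations may use smarter update rules, so treat this only as an order-of-magnitude picture.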

    So you could buy a faster computer, which would speed up the calculation a bit, but as you can see from the CPU load, most of your cores are sitting idle. So instead of buying a faster computer, it would be more efficient to give us the money and let us implement a multi-threaded version of the algorithms so that they run in parallel. This would give you a speedup of a factor of 3 on your machine.

    Another possibility would be to reduce the dimensionality of the examples, for example by using PCA.
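As a rough idea of what such a reduction does, the sketch below projects 3-feature examples onto their first principal component using power iteration on the covariance matrix. This is plain Python written for illustration only, not RapidMiner's PCA operator; the function name `first_principal_component` and the sample rows are assumptions made for the example.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores): the first principal direction of `rows`
    and each row's 1-D coordinate along it, via power iteration."""
    n = len(rows)
    dim = len(rows[0])
    # Center each feature on its mean.
    means = [sum(r[d] for r in rows) / n for d in range(dim)]
    centered = [[r[d] - means[d] for d in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0 / (d + 1) for d in range(dim)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the data at all
            break
        v = [x / norm for x in w]
    # Project every centered example onto the direction.
    scores = [sum(c[d] * v[d] for d in range(dim)) for c in centered]
    return v, scores

# Hypothetical per-pixel features: the second feature is twice the first,
# the third is constant, so one component captures all of the variance.
rows = [(0.0, 0.0, 0.5), (0.1, 0.2, 0.5), (0.2, 0.4, 0.5), (0.3, 0.6, 0.5)]
direction, scores = first_principal_component(rows)
print(direction)
print(scores)
```

Clustering the 1-D `scores` instead of the full 3-feature rows reduces the cost of each distance evaluation, though it does not change how many evaluations an algorithm needs.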
