K-Means initialization (newbie question)
Being a newbie to both data mining and RapidMiner, I have run into a question regarding the k-means clustering algorithm.
First of all, from my theory book (Introduction to Data Mining by Tan, Steinbach & Kumar) I learned that the choice of initial centroids is essential for the success of k-means: a poor choice can make the algorithm converge to a bad clustering. With the examples in the book, I understand that. But I would like to learn how to control this in practice using RapidMiner.
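To make concrete what I mean, here is a minimal sketch in plain Python (not RapidMiner) of basic k-means with explicitly given starting centroids. The data, the function names and the two starting configurations are my own made-up example, not taken from the book or from RapidMiner. The same four points end up in a good or a poor clustering depending only on where the centroids start.

```python
def squared_euclidean(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iter=100):
    """Basic k-means starting from the given centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_euclidean(p, centroids[i]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its old centroid.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)

        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids

    sse = sum(squared_euclidean(p, centroids[i])
              for i, members in enumerate(clusters) for p in members)
    return centroids, sse


# Four corners of a wide, flat rectangle (made-up data).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start near the left and right pairs: finds the natural clusters.
good_centroids, good_sse = kmeans(points, [(0, 0.5), (4, 0.5)])

# Start in the middle of the bottom and top edges: gets stuck
# splitting the points into a bottom pair and a top pair.
bad_centroids, bad_sse = kmeans(points, [(2, 0), (2, 1)])

print(good_sse)  # 1.0
print(bad_sse)   # 16.0
```

Both runs converge, but the second one stops in a much worse solution, which is why I would like to be able to control the starting centroids.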
Is there a way to set the initial centroids? I don't see any parameter for it on the k-means operator.
Am I misunderstanding the theory? Or is this something RapidMiner just doesn't support, so that it always uses random initialization?
Another question is whether RapidMiner uses Euclidean distance, Manhattan distance or another distance measure, and whether this can be configured.
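The reason I ask about the distance measure is that it can change which centroid a point is assigned to. Below is a small plain-Python sketch of that effect. The point, the centroids and the helper names are made up for illustration, and it says nothing about what RapidMiner actually does internally.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest_centroid(point, centroids, distance):
    """Index of the centroid closest to the point under the given distance."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))


point = (0, 0)
centroids = [(3, 3), (0, 5)]

# Euclidean: sqrt(18) is about 4.24 versus 5, so centroid 0 is nearer.
by_euclidean = nearest_centroid(point, centroids, euclidean)

# Manhattan: 6 versus 5, so centroid 1 is nearer.
by_manhattan = nearest_centroid(point, centroids, manhattan)

print(by_euclidean)  # 0
print(by_manhattan)  # 1
```

This only compares the assignment step. As far as I understand, using the mean as the centroid goes together with squared Euclidean distance, so a different measure may also call for a different way of updating the centroids.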