# input ncluster for k-means

kdamodaran, Contributor I (Posts: **2**)

Hi all,

I am new to RapidMiner. I am interested in applying k-means clustering to a dataset of a few thousand elements with real-valued attributes, so the standard sum of squared distances to the centroid should work as the convergence metric.

A couple of trials I have run using k-means just partition the data into two clusters, which seems to be the default. How can I specify the number of clusters?

Thanks,

Dam

## Answers

Maven (Posts: **106**)

Click on the k-Means operator box in the process and set k in the Parameters window to the desired value.

BTW, the algorithm has converged when the centroids no longer change between two consecutive iterations. The sum of squared distances (i.e. the squared error) serves a different purpose: it provides a criterion for selecting the best solution among the possibly multiple solutions generated.
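To make the two roles concrete (centroid stability as the stopping test, squared error as the selection criterion), here is a minimal sketch in plain Python. It is not RapidMiner's implementation: the function names (`kmeans_once`, `best_of_runs`), the `runs` and `max_iter` parameters, and the choice of random data points as starting centroids are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run; stops as soon as the centroids no longer change."""
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Mean of the members, coordinate by coordinate.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # empty cluster keeps its centroid
        if new_centroids == centroids:  # convergence test: centroids unchanged
            break
        centroids = new_centroids
    labels = assign(points, centroids)
    # Squared error: sum of squared distances of points to their centroid.
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return sse, centroids, labels


def best_of_runs(points, k, runs=10, seed=0):
    """Repeat k-means with different starts and keep the lowest squared error."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(runs)), key=lambda r: r[0])


# Hypothetical toy data: three well-separated groups of 2-D points.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 0.9), (8.9, 1.2),
]
sse, centroids, labels = best_of_runs(points, k=3)
print(sse, labels)
```

Each single run can stop in a different local optimum depending on its starting centroids, which is why the squared error is only compared across finished runs rather than used as the stopping test.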

Regards,

Dan

Contributor I (Posts: **2**)

On a related note, is it possible to retain the nominal ids of the elements being processed? Sure, we can always drop the clustering output into Excel and match it with the original ids, but ...

Thanks for your help!

Dam

Unicorn (Posts: **2,531**)

Maybe you have deactivated the corresponding view. Go to the View menu, select Show View, and then Parameters if it is not already selected.

For more information about RapidMiner's GUI and its general concepts, I suggest you take a look at the Manual, which is available in English and German.

Greetings,

Sebastian