input ncluster for k-means

kdamodaran Member Posts: 2 Contributor I
edited November 2018 in Help
Hi all,
I am new to RapidMiner. I am interested in applying k-means clustering to a dataset consisting of a few thousand elements whose attributes are real-valued. So the standard sum of squared distances to the centroid will work as the metric for convergence.

The couple of trials I have run using k-means just partition the data into two clusters, which seems to be the default. How can I specify the number of clusters?

Thanks,
Dam

Answers

  • dan_agape Member Posts: 106  Guru
    Hi,

    Click on the k-Means operator box in the process and set k in the Parameters window to the desired value.

    BTW, the algorithm has converged when the centroids do not change between two consecutive iterations. The sum of squared distances (i.e. the squared error) instead provides a criterion for selecting the best solution when multiple solutions are generated.

    Regards,
    Dan
  • kdamodaran Member Posts: 2 Contributor I
    That's what I was expecting too. But I don't get a Parameters window. Am I missing something that's totally obvious?! The only thing that seems close in the dialog box is "Show Operator Info", which also doesn't have a parameter window.
    On a related note, is it possible to retain the nominal IDs of the elements being processed? Sure, we can always drop the clustering output into Excel and match it with the original IDs, but...
    Thanks for your help!
    Dam
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,527   Unicorn
    Hi,
    You might have deactivated the corresponding view. Go to the View menu, select Show View, and then Parameters if it is not already selected.
    For more information about RapidMiner's GUI and its concepts in general, I would suggest you take a look at the manual, which is available in English and German.

    Greetings,
      Sebastian
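
For readers who want to see the ideas from this thread outside the GUI, here is a minimal plain-Python sketch. It is not RapidMiner's implementation, only an illustration under stated assumptions: the function names `kmeans_once` and `kmeans`, the `restarts` and `seed` parameters, and the sample records are all hypothetical. It shows the number of clusters `k` passed explicitly, convergence detected when the centroids stop changing, the sum of squared distances used to pick the best of several runs, and the original IDs carried through to the result.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from a random start; returns (centroids, labels, sse)."""
    centroids = rng.sample(points, k)  # pick k input points as initial centroids
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep old centroid
        # Convergence: centroids unchanged between two consecutive iterations.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Squared error of this run, used only to compare runs with each other.
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse

def kmeans(records, k, restarts=10, seed=0):
    """records is a list of (id, point) pairs.

    Runs k-means `restarts` times and keeps the run with the lowest squared
    error. Returns ({id: cluster_index}, sse) so the IDs stay attached.
    """
    rng = random.Random(seed)
    ids = [rid for rid, _ in records]
    points = [tuple(p) for _, p in records]
    runs = [kmeans_once(points, k, rng) for _ in range(restarts)]
    _, labels, sse = min(runs, key=lambda run: run[2])
    return dict(zip(ids, labels)), sse

# Hypothetical sample data: three well-separated pairs of 2-D points.
records = [("a", (1.0, 1.0)), ("b", (1.2, 0.8)),
           ("c", (5.0, 5.1)), ("d", (5.2, 4.9)),
           ("e", (9.0, 1.0)), ("f", (9.1, 1.2))]
assignment, sse = kmeans(records, k=3, restarts=20)
for rid in sorted(assignment):
    print(rid, assignment[rid])
```

The cluster indices themselves are arbitrary and can differ between seeds; what matters is which IDs share an index. Several restarts are used because a single run can stop at a poor local solution, which is exactly where the squared error helps choose between runs.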