Error when determining the silhouette for k-means clustering

livnhn Member Posts: 1 Contributor I
edited November 2018 in Help

Hello,

I'm using RapidMiner version 8.1 and want to calculate the silhouette value for a k-means clustering.

For this purpose I downloaded the plugin from the following URL:
http://korek.name/web/moje-tvorba/rapidminer-clustering_performance_plugin-average_silhouette-cophenetic_coefficient

My process looks like this: photo_2018-03-10_13-46-58.jpg

Running this process produces the following error: Snap2.jpg

Does anyone know where the problem is?

Thanks and regards
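
For reference, the silhouette value asked about here can also be checked outside RapidMiner. The following is only a minimal sketch in plain Python, assuming Euclidean distance and a small made-up data set; the function name `silhouette_values` and the sample points are illustrative and are not part of the plugin or of RapidMiner.

```python
import math

def silhouette_values(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        denominator = max(a, b)
        values.append((b - a) / denominator if denominator > 0 else 0.0)
    return values

# Made-up example: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_values(points, ["cluster_0", "cluster_0", "cluster_1", "cluster_1"])
bad = silhouette_values(points, ["cluster_0", "cluster_1", "cluster_0", "cluster_1"])
good_average = sum(good) / len(good)
bad_average = sum(bad) / len(bad)
print(good_average, bad_average)
```

An average close to 1 indicates compact, well-separated clusters, while a negative average (as for the deliberately wrong labelling `bad`) suggests that points sit closer to another cluster than to their own.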

Answers

  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research

    Hi,

    I don't know that specific extension, but judging from the error, my guess is that it requires a numerical cluster attribute. You could try adding a Nominal to Numerical operator after the clustered ExampleSet and using it to convert the cluster attribute to unique integers (remember to check "include special attributes").

    No guarantee, but it might do the trick.

    Best,
    David
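
To illustrate what a "unique integers" conversion of the cluster attribute amounts to, here is a small sketch in plain Python. It is not the implementation of the Nominal to Numerical operator; the helper name `labels_to_unique_integers` and the label values such as `cluster_0` are assumptions made for the example.

```python
def labels_to_unique_integers(labels):
    """Map each distinct nominal label to an integer, in order of first appearance."""
    mapping = {}
    codes = []
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)  # next unused integer
        codes.append(mapping[label])
    return codes, mapping

# Hypothetical nominal cluster labels, one per example.
labels = ["cluster_0", "cluster_2", "cluster_0", "cluster_1"]
codes, mapping = labels_to_unique_integers(labels)
print(codes)
print(mapping)
```

The integers only serve as identifiers: they carry no ordering between clusters, so any operator consuming them should treat them as labels and not as quantities.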

     

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi @livnhn, that looks like a VERY old extension, so I would not be surprised if it produced errors with RapidMiner 8. If you're just trying to do performance calculations for k-means clustering, there are several native operators that can help:

    Screen Shot 2018-03-12 at 9.51.47 AM.png

    Scott

     

  • Muhammed_Fatih_ Member Posts: 93 Maven
    Hi @sgenzer

    Which one of the native ones would you recommend for evaluating k in k-means?

    Cheers!
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi Muhammed_Fatih_,

    Do you mean determining a priori the best number of clusters k?
    You can use the Performance (Cluster Distance Performance) operator and set the main criterion to Average within centroid distance in the parameters.
    Then you can use an optimization loop to plot the average within centroid distance against k (the number of clusters).
    The method is explained in this thread:

    https://community.rapidminer.com/discussion/comment/61654#Comment_61654

    Hope this helps,

    Regards,

    Lionel 
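
The idea of plotting the average within centroid distance against k can be sketched outside RapidMiner as follows. This is only an illustration under stated assumptions: it uses a hand-written k-means with a deterministic farthest-first initialization, plain Euclidean distance, and made-up data with three groups. RapidMiner's own operator may initialize differently and may report the criterion with a different sign or scaling; the names `kmeans`, `average_within_centroid_distance`, and `curve` are hypothetical.

```python
import math

def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means with farthest-first initialization (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # pick the point farthest from all centroids chosen so far
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids

def average_within_centroid_distance(points, centroids):
    """Mean Euclidean distance from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in points) / len(points)

# Made-up data: three small groups around three centers.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# One value per candidate k; the "elbow" is where the curve stops dropping sharply.
curve = {k: average_within_centroid_distance(points, kmeans(points, k)) for k in range(1, 6)}
for k, value in curve.items():
    print(k, round(value, 3))
```

On this toy data the distance drops sharply up to k = 3 and only marginally afterwards, which is the kind of bend one would look for in the plot produced by the optimization loop.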


  • Muhammed_Fatih_ Member Posts: 93 Maven
    Hi @lionelderkrikor

    Yes, the a priori determination of the number of clusters k. Thank you for the hint!