
Measuring clustering quality for previously clustered data

singing_bird_1 Member Posts: 16 Contributor I
edited December 2018 in Help

Hi all, I am new to RapidMiner. I have data that has already been clustered, and I want to load it into RapidMiner together with its cluster labels so that it can be evaluated with one of the clustering evaluation measures.

Note: I don't want to recluster my data; I want to evaluate it as it is, with its existing labels.

How can I do this?

Thanks in advance

Answers

  • FBT Member Posts: 106 Unicorn

    Edit: I misread your question. Would you be able to post your data, or parts of it? Measuring the performance should be straightforward, as long as labels and relevant attributes are available. 

  • singing_bird_1 Member Posts: 16 Contributor I

    Thanks for your reply.

    Attached is a part of the data with its cluster labels.

    There are 3 clusters.

    The problem is that the SSE performance measure requires the data (which is not a problem) and also the centroids, which are unknown because the data arrives already labeled.

    Silhouette requires the data, the cluster model or the centroids, and the similarity measure.

    How can I arrange the operators in the process to measure the quality of the given clustering, and which operators should I use?
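
    For reference, outside RapidMiner the SSE can be computed from labels alone if one assumes that each centroid is simply the mean of the points carrying that label (the k-means convention). The following is only a minimal sketch under that assumption, using squared Euclidean distance; the function name `sse_from_labels` and the toy data are made up for illustration and are not the attached data set.

```python
from collections import defaultdict


def sse_from_labels(points, labels):
    """Return (SSE, centroids) for already-labeled data.

    Assumption: the centroid of a cluster is the mean of its members,
    and distance is squared Euclidean.
    """
    # Group the points by their existing cluster label.
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)

    centroids = {}
    total = 0.0
    for label, members in groups.items():
        dim = len(members[0])
        # Centroid = per-dimension mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        centroids[label] = centroid
        # Add squared distances of every member to its centroid.
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total, centroids


# Hypothetical toy data with 3 clusters (illustration only).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0), (20.0, 0.0), (22.0, 0.0)]
labels = ["c1", "c1", "c2", "c2", "c3", "c3"]

sse, centroids = sse_from_labels(points, labels)
print("SSE:", sse)
print("Centroids:", centroids)
```

    If the original clustering was not centroid-based (for example density-based), the mean of each group may not be a meaningful centroid, so the resulting SSE should be read with that caveat.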

     

  • FBT Member Posts: 106 Unicorn

    Ok, I don't think you can make any meaningful performance evaluations like this, because the data is missing information (e.g. the cluster model). What would you like to achieve? I.e. what is the question about the clusters that you would like to have answered?

  • singing_bird_1 Member Posts: 16 Contributor I

    My question is how to measure the clustering quality despite the missing information, such as the cluster model and the distance measure needed for silhouette.

    If the answer is that this cannot be done in RapidMiner because of the missing information, please suggest a way to measure the clustering quality with another program, or share code for SSE and silhouette.
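
    As a possible starting point for the silhouette part, the coefficient needs only the points, their labels, and a distance measure, not a cluster model. Below is a minimal pure-Python sketch that assumes Euclidean distance (the choice of distance is an assumption; the original clustering may have used something else). The name `silhouette_from_labels` and the toy data are hypothetical. Singleton clusters are given a score of 0, which is a common convention.

```python
import math


def silhouette_from_labels(points, labels):
    """Mean silhouette coefficient for already-labeled data.

    Assumption: Euclidean distance. Needs at least 2 clusters.
    """
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least 2 clusters")

    n = len(points)
    scores = []
    for i in range(n):
        # Accumulate distances from point i to the members of each cluster.
        sums = {c: 0.0 for c in clusters}
        counts = {c: 0 for c in clusters}
        for j in range(n):
            if j == i:
                continue
            sums[labels[j]] += math.dist(points[i], points[j])
            counts[labels[j]] += 1

        own = labels[i]
        if counts[own] == 0:
            # Point is alone in its cluster: score 0 by convention.
            scores.append(0.0)
            continue

        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(sums[c] / counts[c] for c in clusters if c != own)  # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / n


# Hypothetical toy data: two well-separated groups (illustration only).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = ["a", "a", "b", "b"]
bad_labels = ["a", "b", "a", "b"]

print("good labeling:", silhouette_from_labels(points, good_labels))
print("bad labeling:", silhouette_from_labels(points, bad_labels))
```

    Values close to 1 suggest compact, well-separated clusters, while values near 0 or below suggest overlapping or mislabeled points. The double loop is quadratic in the number of points, so for larger data a library routine such as scikit-learn's `silhouette_score(X, labels)` is likely a better fit.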
