"Clustering with KMeans"

cpc2cpc2 Member Posts: 18 Maven
edited May 2019 in Help
Hi,
I have a question about the Weka W-SimpleKMeans algorithm. When I use the operator, is there anywhere in the Result mode a detailed description of the clusters found during the analysis? The Text Mode window shows only the number of clusters and the number of items that belong to each.
When I use SimpleKMeans in Weka itself, there is a detailed description of each cluster with its attribute values. Is there anything like that in the RapidMiner version?
Thanks in advance, Birger.

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    If you replace the Weka version with the RapidMiner KMeans operator, you will get details about your centroids.

    Greetings,
      Sebastian
  • cpc2cpc2 Member Posts: 18 Maven
    Hi,
    The problem with the RapidMiner KMeans is that it can only handle numerical attributes, while the Weka version can also handle nominal ones. Is there any other solution to this problem?
    Thanks in advance, Birger
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    You cannot apply KMeans to nominal values. It will not work correctly, because KMeans implicitly always uses the Euclidean distance between examples, and this distance is simply not defined for nominal values like apples and eggs.
    You might switch to KMedoids and use one of the mixed measures for calculating the distance, or you could transform the nominal values into numerical ones in a reasonable manner beforehand. What is reasonable depends mainly on the data and its meaning, so automatic conversions like the one Weka performs cannot always be reasonable.
    RapidMiner provides several operators for these transformations, such as Nominal2Binominal or Nominal2Numerical. Take a look at them and think about how to represent your nominal values by numeric values that reflect an ordering or a weighting of importance.

    Greetings,
      Sebastian
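To make the point about distances concrete, here is a minimal Python sketch, not RapidMiner code. It assumes a hypothetical nominal attribute with the values apple, egg and milk, and compares an arbitrary integer coding with a binary (one-hot) coding. The helper names `one_hot` and `mixed_distance` are made up for this illustration, and `mixed_distance` is only one plausible way to combine numeric and nominal attributes; it is not claimed to match the definition used by RapidMiner's mixed measures.

```python
import math

# Hypothetical nominal attribute values, for illustration only.
CATEGORIES = ["apple", "egg", "milk"]


def one_hot(value, categories):
    """Encode a nominal value as a binary vector with one position per category."""
    return [1.0 if value == c else 0.0 for c in categories]


def mixed_distance(a, b):
    """Distance over examples that mix numeric and nominal attributes.

    Numeric attributes contribute their squared difference; nominal
    attributes contribute 0 if the values are equal and 1 otherwise.
    """
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += (x - y) ** 2
        else:
            total += 0.0 if x == y else 1.0
    return math.sqrt(total)


# Arbitrary integer codes impose an ordering the data does not have:
# "milk" ends up twice as far from "apple" as "egg" is.
codes = {c: float(i) for i, c in enumerate(CATEGORIES)}
d_code_egg = abs(codes["apple"] - codes["egg"])
d_code_milk = abs(codes["apple"] - codes["milk"])

# With a binary coding, every pair of distinct values is equally far apart.
d_hot_egg = math.dist(one_hot("apple", CATEGORIES), one_hot("egg", CATEGORIES))
d_hot_milk = math.dist(one_hot("apple", CATEGORIES), one_hot("milk", CATEGORIES))
```

Whether equal distances between all values or an explicit ordering is the better representation depends on the data, as noted above.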
  • cpc2cpc2 Member Posts: 18 Maven
    Hi,
    Thanks for the effort. I will take a closer look at the operators and at what I'm trying to analyse. ;)
    Birger
  • cpc2cpc2 Member Posts: 18 Maven
    OK, I'm now using KMedoids with nominal attributes and Mixed Measures. My problem is that in the Result mode, the description of the medoids shows their nominal attributes as numbers. Is it possible to display the "real" nominal values, or to convert the numbers back to the nominal values?
    Furthermore, I would like to write the description of the cluster medoids to a file, but I haven't found an IO operator for this.
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    I'm sorry, but I think this isn't possible yet.

    Greetings,
      Sebastian
  • cpc2cpc2 Member Posts: 18 Maven
    OK, thanks for the info.

    Birger.
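As a possible workaround outside RapidMiner, the following Python sketch shows how numeric codes in a medoid description could be mapped back to nominal values and written to a file. It rests on assumptions that are not confirmed anywhere in this thread: that the medoid values have been copied out by hand, and that each number is an index into a known list of nominal values. The attribute names, the mapping and the functions `decode_medoid` and `write_medoids` are all hypothetical.

```python
import csv
import io


def decode_medoid(medoid, mappings):
    """Replace numeric codes with nominal labels for the attributes listed in mappings.

    medoid   -- dict of attribute name -> value, as copied from the result view
    mappings -- dict of attribute name -> list of labels, where the list index
                is ASSUMED to equal the numeric code shown for that attribute
    """
    decoded = {}
    for name, value in medoid.items():
        if name in mappings:
            decoded[name] = mappings[name][int(value)]
        else:
            decoded[name] = value
    return decoded


def write_medoids(medoids, mappings, out):
    """Write the decoded medoids as CSV to an open text file object."""
    decoded = [decode_medoid(m, mappings) for m in medoids]
    writer = csv.DictWriter(out, fieldnames=list(decoded[0].keys()))
    writer.writeheader()
    writer.writerows(decoded)


# Hypothetical example data: one nominal and one numeric attribute.
mappings = {"food": ["apple", "egg"]}
medoids = [
    {"food": 1.0, "weight": 2.5},
    {"food": 0.0, "weight": 0.3},
]

buffer = io.StringIO()  # stands in for a file opened with open(path, "w", newline="")
write_medoids(medoids, mappings, buffer)
```

To write to disk instead of a string buffer, pass a file opened with `open(path, "w", newline="")`, where `path` is whatever location suits you.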