Joining two examplesets by Clusters.



I have an exampleset on which k-means has been run, with a variable 'Cluster' that gives the cluster assignment of each example. The same data also has an ID variable.

I have another exampleset that gives the cluster centroids. It also has a variable called 'cluster' with the same values as in the first exampleset, but each cluster appears only once: if there are 3 clusters, there are only 3 examples here. This exampleset does NOT have an ID variable.

In both examplesets, the 'cluster' variable is of type regular.

Now I want to join the two datasets on the 'cluster' variable, but I get the error: "Input exampleset does not have an ID attribute". How do I join the data by 'cluster' instead? Do I have to change the regular 'cluster' variable to an ID variable in both examplesets?

Thank you very much for your help,


Re: Joining two examplesets by Clusters.

If I understand you correctly: yes.

Why did you not simply try it out?
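
For readers who want to see the logic of such a join outside RapidMiner, here is a minimal plain-Python sketch. It is not RapidMiner code and does not reproduce the Join operator; it only illustrates why the cluster label can act as the key: it is unique in the centroid table, so each example matches exactly one centroid row. All names and values (`examples`, `centroids`, `join_on_cluster`, the coordinates) are hypothetical and chosen for illustration.

```python
# Hypothetical stand-in for the clustered exampleset: one row per example,
# with an ID and the cluster label assigned by k-means.
examples = [
    {"id": 1, "cluster": "cluster_0"},
    {"id": 2, "cluster": "cluster_1"},
    {"id": 3, "cluster": "cluster_0"},
    {"id": 4, "cluster": "cluster_2"},
    {"id": 5, "cluster": "cluster_1"},
]

# Hypothetical stand-in for the centroid table: one row per cluster, no ID.
centroids = [
    {"cluster": "cluster_0", "centroid_x": 0.5, "centroid_y": 1.0},
    {"cluster": "cluster_1", "centroid_x": 2.0, "centroid_y": 3.0},
    {"cluster": "cluster_2", "centroid_x": 4.5, "centroid_y": 0.2},
]


def join_on_cluster(examples, centroids, key="cluster"):
    """Inner join: attach the matching centroid row to every example."""
    # Build a lookup from cluster label to centroid row. The label must be
    # unique here, which is what lets it play the role of a key.
    lookup = {}
    for row in centroids:
        if row[key] in lookup:
            raise ValueError(f"duplicate key in centroid table: {row[key]}")
        lookup[row[key]] = row

    joined = []
    for row in examples:
        match = lookup.get(row[key])
        if match is None:
            continue  # inner join: drop examples without a matching centroid
        joined.append({**row, **match})
    return joined


joined = join_on_cluster(examples, centroids)
for row in joined:
    print(row)
```

Every example keeps its own ID and gains the centroid columns of its cluster, so the result has as many rows as the first exampleset.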

