Clustering with labels?
Hi,
Is there any way to do clustering with labels in order to check performance (as in classification)? Which operator can I use for that (e.g. with k-means)?
Also, is there a way to cluster the data with the "help" of labels when the class is known? I mean clustering based on the given labels, e.g. finding out which class labels are clustered together, then getting the centroid of each local cluster, and so on.
Is there an existing operator that uses labels for clustering? I just want to find out more about the properties of my dataset and my classes (e.g. local cluster labels, centroid tables, etc.).
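For the "centroid per class" part of the question, a minimal sketch outside RapidMiner: if the label is already known, the centroid of each class is just the per-attribute mean of its rows. The function name `class_centroids` and the toy data are made up for illustration.

```python
from collections import defaultdict


def class_centroids(rows, labels):
    """Return {label: centroid}, where a centroid is the per-attribute mean."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy data (hypothetical): two numeric attributes, two classes.
rows = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
labels = ["a", "a", "b"]
centroids = class_centroids(rows, labels)
```

This only summarises the labeled groups; it does not run any clustering algorithm.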
Best Answer
MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist
Did you try Map Clustering on Labels, followed by the performance operators?
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Answers
If you have labeled data, clustering is most of the time like bringing owls to Athens...
Of course, you can use 'Set Role' to turn the label column into a regular attribute and pretend not to have any label information. Once the data no longer has the special attribute 'label', you can do any clustering you want.
Hope that makes sense...
I know the purpose of clustering, but I want to compare the found clusters with the labeled "clusters", if you know what I mean, to measure the "goodness" of the clusters by comparing them with some ground truth...
Is there any sophisticated way to do so? Any ideas?
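One generic way to compare found clusters with a ground truth is a cluster-versus-label contingency table plus a purity score (the share of examples that fall into the dominant class of their cluster). This is a plain Python sketch, not a RapidMiner operator; the function names and the toy data are assumptions for illustration.

```python
from collections import Counter


def contingency(cluster_ids, labels):
    """Count how often each (cluster, label) pair occurs."""
    return Counter(zip(cluster_ids, labels))


def purity(cluster_ids, labels):
    """Fraction of examples that belong to the dominant label of their cluster."""
    table = contingency(cluster_ids, labels)
    best = {}
    for (cluster, _label), count in table.items():
        best[cluster] = max(best.get(cluster, 0), count)
    return sum(best.values()) / len(labels)


# Toy data (hypothetical): cluster 0 is mixed, cluster 1 is pure.
cluster_ids = [0, 0, 0, 1, 1, 1]
labels = ["x", "x", "y", "y", "y", "y"]
table = contingency(cluster_ids, labels)
score = purity(cluster_ids, labels)
```

Purity is easy to read but rewards many small clusters, so it is best compared across runs with the same number of clusters.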
Yeah, thanks, that seemed to work, but I still don't know how that operator works.
How does it choose which cluster gets which label?
Mh, good question. The important code is in ClusterToPrediction.java - but it's quite a chunk.
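To give an idea of what such a mapping can look like, here is a small sketch that tries every one-to-one assignment of clusters to labels and keeps the one with the most matching examples. This is an assumption about the general technique, not a transcription of ClusterToPrediction.java, and the brute-force search is only practical for a handful of clusters. All names in it are hypothetical.

```python
from collections import Counter
from itertools import permutations


def best_one_to_one_mapping(cluster_ids, labels):
    """Return (mapping, accuracy) for the best one-to-one cluster-to-label assignment."""
    clusters = sorted(set(cluster_ids))
    classes = sorted(set(labels))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than class labels; no one-to-one mapping exists")
    table = Counter(zip(cluster_ids, labels))
    best_mapping, best_hits = None, -1
    # Try every way of giving each cluster a distinct label.
    for assigned in permutations(classes, len(clusters)):
        hits = sum(table[(c, l)] for c, l in zip(clusters, assigned))
        if hits > best_hits:
            best_mapping, best_hits = dict(zip(clusters, assigned)), hits
    return best_mapping, best_hits / len(labels)


# Toy data (hypothetical): c0 is mostly "yes", c1 is all "no".
cluster_ids = ["c0", "c0", "c0", "c1", "c1"]
labels = ["yes", "yes", "no", "no", "no"]
mapping, accuracy = best_one_to_one_mapping(cluster_ids, labels)
```

Once each cluster has a label, the mapped cluster can be treated like a prediction and scored with the usual classification measures.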
Hi, how should I use this code in the program? Where should I copy it to in order to use it?
Thank you
Sorry for asking
hi
sorry
please help me
thanks
One further question in this connection: which classification model does the "Map Clustering on Labels" operator consider for the subsequent calculation of performance values?
Thank you in advance for your response!
Best regards!
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts