Clustering output
Hi, I've searched the forums for an answer to this question but I haven't found one yet.
I have a csv file which has 4 columns:
1) Id (used as a label)
2) Clustering attribute 1
3) Clustering attribute 2
4) Other information
The clustering attributes are the ones used in the actual clustering, while the other information is not used.
Currently, I use the Select Attributes operator so that only the clustering attributes are used for the clustering. However, the Write CSV output of the clustering contains only the following columns:
1) Id
2) Clustering attribute 1
3) Clustering attribute 2
4) Cluster that the ID belongs to
However, I also need the 4th column (Other information) from the original dataset in the clustering output, even though it's not used for the clustering. Is there any way to do this?
Thanks very much in advance!
Answers
I'm guessing that the clustering operator is not being passed the 4th column, so the output cannot contain it either. To fix this, use Set Role to give the 4th column a role of your own free-text choosing, say "other". Do this before the clustering operator and the attribute will pass through unnoticed. You should then see the attribute in the output. The basic point is that most operators work on regular attributes only.
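To illustrate the principle outside RapidMiner, here is a minimal Python sketch. It is not RapidMiner code: the column names (`id`, `attr1`, `attr2`, `other`), the sample data, and the helper functions `kmeans` and `cluster_rows` are all made up for the example, and the tiny k-means is only a stand-in for whichever clustering operator you use. The point it shows is that the clustering reads only the "regular" columns, while every other column stays on the row and reappears in the output next to the cluster label.

```python
import csv
import io


def kmeans(points, k, iterations=20):
    """Tiny deterministic k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def cluster_rows(rows, regular, k):
    """Cluster on the `regular` columns only; all other columns pass through."""
    points = [[float(row[name]) for name in regular] for row in rows]
    labels = kmeans(points, k)
    # Copy each full row (including unused columns) and append the cluster label.
    return [dict(row, cluster=f"cluster_{label}") for row, label in zip(rows, labels)]


# Hypothetical stand-in for the 4-column CSV file described in the question.
SAMPLE_CSV = """id,attr1,attr2,other
a,1.0,1.1,red
b,0.9,1.0,green
c,8.0,8.2,blue
d,8.1,7.9,red
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
result = cluster_rows(rows, regular=["attr1", "attr2"], k=2)

# Write the result as CSV: id, attr1, attr2, other, cluster.
writer = csv.DictWriter(io.StringIO(), fieldnames=list(result[0].keys()))
writer.writeheader()
writer.writerows(result)
for row in result:
    print(row)
```

In RapidMiner terms, the `regular` list plays the part of the regular attributes, and the `other` column behaves like an attribute with a special role: the clustering ignores it, but it is still there when the result is written out.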
Andrew