Clustering output

rohanm93 Member Posts: 1 Contributor I
edited December 2019 in Help
Hi, I've searched the forums for an answer to this question but I haven't found one yet.

I have a csv file which has 4 columns:
1) Id (used as a label)
2) Clustering attribute 1
3) Clustering attribute 2
4) Other information

The clustering attributes are the ones used in the actual clustering, while the other information is not used.

Currently, I use the Select Attributes operator so that only the clustering attributes are used for the clustering. However, the Write CSV output of the clustering contains only the following columns:
1) Id
2) Clustering attribute 1
3) Clustering attribute 2
4) Cluster that the ID belongs to

However, I need the 4th column (Other information) from the original dataset to appear in the clustering output, even though it's not used in the clustering. Is there any way to do this?

Thanks very much in advance! :)

Answers

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello rohanm93,

    I'm guessing that the clustering operator is never passed the 4th column, so the output cannot contain it either. To fix this, use Set Role to give the 4th column a role with a free-text name of your own choosing, say "other". Do this before the clustering operator, and the attribute will pass through unnoticed. You should then see the attribute in the output. The basic point is that most operators work on regular attributes only.

    Andrew
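
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of the same principle: keep every column in the data, cluster on the "regular" columns only, and let the remaining columns ride along into the output. This is not RapidMiner code. The column names, the sample rows, and the small `kmeans` helper are all hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for the poster's CSV file.
RAW = """id,attr1,attr2,other
a,1.0,1.1,red
b,0.9,1.0,green
c,8.0,8.2,blue
d,8.1,7.9,red
"""

# Only these columns act as "regular" attributes. "id" and "other" are
# carried along untouched, like attributes that have a special role.
REGULAR = ["attr1", "attr2"]


def kmeans(points, k, iterations=10):
    """Tiny k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


rows = list(csv.DictReader(io.StringIO(RAW)))

# Build the clustering input from the regular columns only.
points = [[float(row[name]) for name in REGULAR] for row in rows]

# Attach the cluster label to the full row, so "other" is still present.
for row, label in zip(rows, kmeans(points, k=2)):
    row["cluster"] = f"cluster_{label}"

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

The key design choice is that columns are never dropped before clustering. The clustering step simply reads a subset of them, which mirrors how an attribute with a non-regular role is ignored by the operator yet stays in the example set.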