Clustering output

rohanm93 · May 2014

Hi, I've searched the forums for an answer to this question but I haven't found one yet.

I have a csv file which has 4 columns:
1) Id (used as a label)
2) Clustering attribute 1
3) Clustering attribute 2
4) Other information

The clustering attributes are the ones used in the actual clustering, while the other information is not used.

Currently, I use the Select Attributes operator to ensure that only the clustering attributes are used for the clustering. However, in the Write CSV output of the clustering, I get only the following columns:
1) Id
2) Clustering attribute 1
3) Clustering attribute 2
4) Cluster that the ID belongs to

I need the 4th column from the original dataset (Other information) to appear in the clustering output as well, even though it's not used in the clustering. Is there any way to do this?

Thanks very much in advance!

awchisholm · May 2014

Hello rohanm93,

I'm guessing that the 4th column is being filtered out before the clustering operator, so the output cannot contain it either. To fix this, keep the column in the data and use Set Role to give it a special role with a free-text name of your own choosing, say "other". Do this before the clustering operator; the attribute will then pass through the clustering unnoticed, and you should see it in the output. The basic point is that most operators work on regular attributes only.

Andrew
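
To illustrate the idea outside RapidMiner, here is a minimal plain-Python sketch of the "regular attributes only" behaviour described above. It is not RapidMiner code: the column names, the sample rows, the `roles` mapping, and the `assign_clusters` helper are all hypothetical, and fixed centroids with nearest-centroid assignment stand in for a real clustering algorithm. The point is only that the clustering step reads the regular columns while every other column rides along into the output.

```python
# Hypothetical data mirroring the four columns described in the question.
rows = [
    {"id": "a", "attr1": 1.0, "attr2": 1.2, "other": "north"},
    {"id": "b", "attr1": 0.8, "attr2": 1.0, "other": "south"},
    {"id": "c", "attr1": 5.0, "attr2": 5.1, "other": "east"},
    {"id": "d", "attr1": 5.2, "attr2": 4.9, "other": "west"},
]

# Role per column: only "regular" columns are seen by the clustering step.
# "id" and "other" are special roles, so they are ignored but kept.
roles = {"id": "id", "attr1": "regular", "attr2": "regular", "other": "other"}

# Assumption: fixed centroids stand in for a real clustering algorithm.
centroids = [(1.0, 1.0), (5.0, 5.0)]


def assign_clusters(rows, roles, centroids):
    """Label each row with its nearest centroid, using regular columns only."""
    regular = [name for name, role in roles.items() if role == "regular"]
    result = []
    for row in rows:
        point = [row[name] for name in regular]
        # Squared Euclidean distance from the row to each centroid.
        distances = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in centroids
        ]
        labelled = dict(row)  # copy every column, special roles included
        labelled["cluster"] = distances.index(min(distances))
        result.append(labelled)
    return result


clustered = assign_clusters(rows, roles, centroids)
for row in clustered:
    print(row)
```

Each output row still carries the `other` column next to the new `cluster` label, which is the effect the Set Role suggestion aims for.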
