Extracting Classification from Auto Model

gianluca_scheid Member Posts: 10 Contributor I
edited July 2019 in Help
Hi there

I used the Auto Model feature for the first time, for a cluster analysis.
Now I'd like to add a column to my data set indicating which cluster each company belongs to.

How do I extract the classification and add it to my data set?

Thank you in advance

Best regards
GL

Best Answer

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted
    Ah, got it. The rows in the clustered data are in the same order as in the original data, so you can build a simple process that merges the cluster column back into the original data. Alternatively, and even simpler if you are not familiar with process design in RapidMiner, you can:
    1. Export the clustered data to Turbo Prep.
    2. Add the original data set to Turbo Prep as a second data set.
    3. Click on Merge, select Inner Join, and activate the checkbox for Use Row Number as Key.
    4. Update, Commit, done.
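    For readers who want to see what a join on row number amounts to outside RapidMiner, here is a minimal sketch in plain Python. It is not how Turbo Prep is implemented; the retailer names, KPI values, cluster labels, and the helper merge_by_row_number are all made up for illustration. It relies on the same assumption as the steps above: both tables have the same rows in the same order.

```python
# Sketch: attach cluster labels to the original rows by row position.
# All names and values here are hypothetical illustration data.

# Original data: retailer names plus two KPIs.
original = [
    {"retailer": "Retailer A", "kpi_1": 0.12, "kpi_2": 340.0},
    {"retailer": "Retailer B", "kpi_1": 0.55, "kpi_2": 120.0},
    {"retailer": "Retailer C", "kpi_1": 0.13, "kpi_2": 355.0},
]

# Clustered data: same rows, same order, but without the retailer name.
clustered = [
    {"kpi_1": 0.12, "kpi_2": 340.0, "cluster": "cluster_0"},
    {"kpi_1": 0.55, "kpi_2": 120.0, "cluster": "cluster_1"},
    {"kpi_1": 0.13, "kpi_2": 355.0, "cluster": "cluster_0"},
]


def merge_by_row_number(left, right, extra_columns):
    """Copy extra_columns from right[i] into left[i] for every row index i."""
    # Safety check: a positional join is only meaningful if the row counts match.
    if len(left) != len(right):
        raise ValueError("row counts differ; cannot join on row number")
    merged = []
    for left_row, right_row in zip(left, right):
        row = dict(left_row)  # copy so the input rows stay untouched
        for column in extra_columns:
            row[column] = right_row[column]
        merged.append(row)
    return merged


merged = merge_by_row_number(original, clustered, ["cluster"])
for row in merged:
    print(row["retailer"], row["cluster"])
```

    The length check is a design choice of this sketch: it fails loudly if the two tables have drifted apart, rather than silently pairing the wrong rows.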
    Hope this helps,
    Ingo

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    edited March 2019
    Well, this clustering column has actually already been added to your table. You will find it under the Clustered Data item.
    Or do you want to build a new model on top of this data that assigns new data points to the found clusters? In that case, you need to export the data (store it in a repository) and run Auto Model again on the exported data set, this time with Prediction on the new cluster column. The output will be a model that can assign new data points to the found clusters.
    Little trick: you can also "export" to Turbo Prep and click on Model there to get back into Auto Model on the same data without writing the data into a repository yourself.
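    To make the idea of "a model that assigns new data points to the found clusters" concrete, here is a rough sketch in plain Python. It uses a nearest-centroid rule, which is a stand-in chosen for brevity and not necessarily the kind of model Auto Model would produce. The column names, values, and the helpers fit_centroids and predict are hypothetical.

```python
import math
from collections import defaultdict

# Sketch: learn from already-clustered rows, then label a new row.
# Hypothetical exported data: two KPIs plus the cluster column.
labeled = [
    {"kpi_1": 1.0, "kpi_2": 1.0, "cluster": "cluster_0"},
    {"kpi_1": 1.2, "kpi_2": 0.8, "cluster": "cluster_0"},
    {"kpi_1": 5.0, "kpi_2": 5.0, "cluster": "cluster_1"},
    {"kpi_1": 5.2, "kpi_2": 4.8, "cluster": "cluster_1"},
]
FEATURES = ["kpi_1", "kpi_2"]


def fit_centroids(rows, features, label_column):
    """Return {cluster label: mean feature vector} for the labeled rows."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[label_column]].append([row[f] for f in features])
    return {
        label: [sum(values) / len(points) for values in zip(*points)]
        for label, points in grouped.items()
    }


def predict(centroids, point):
    """Assign the point to the cluster whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


centroids = fit_centroids(labeled, FEATURES, "cluster")

# A new retailer that was not part of the original clustering.
new_retailer = {"kpi_1": 4.9, "kpi_2": 5.3}
assigned = predict(centroids, [new_retailer[f] for f in FEATURES])
print(assigned)
```

    The point of the sketch is the two-stage workflow: the cluster column from the first run becomes the label that a second, predictive model learns from.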
    Hope that helps,
    Ingo

  • gianluca_scheid Member Posts: 10 Contributor I
    Dear Ingo

    Thank you for your reply!

    Maybe I have to be more specific about my problem:

    I have a data set with the names of retailers and two different KPIs. I would like to cluster those retailers into four groups based on the two KPIs, and then add the cluster information (cluster 1, cluster 2, ...) to my original data set. Unfortunately, the "Clustered Data" table no longer contains any retailer information. How can I merge the "Clustered Data" table with my original data set?

    Best regards
    Gianluca

  • gianluca_scheid Member Posts: 10 Contributor I
    Great! I wasn't sure if the row number remains the same for the clustered data :)

    Thank you!