

"Can I and how do I use a correlation matrix for categorical variables?"

serafina Member Posts: 1 Contributor I
edited June 2019 in Help

Hello everyone,

I'm new to RapidMiner, so I apologise in advance for all the silly questions that I ask.

For a university project, I have a dataset that contains both categorical and numerical variables. We are supposed to choose predictors for our label "recommended", which is a binominal variable.

First of all, in addition to the >0.5 correlation rule, can I choose my predictors based on the attribute weights in the AttributeWeight table? How do I interpret this weight table? Why do its values contradict the correlation values?

Second, can I use categorical variables in my correlation matrix? If I can, how do I transform my categorical variables into dummy variables so that I can use them in the matrix? I know about the Nominal to Numerical operator, but I am not sure it is the correct way to go because I am getting only negative correlations! (That's 14 attributes negatively correlated with "recommended".) Is that normal?


Thanks a TON.



  • Pavithra_Rao Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 123  RM Data Scientist

    Hi @serafina,


    Could you please post the process XML file and the sample dataset here so that we can better understand the question?
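
For anyone reading along, here is a minimal sketch of what dummy coding a categorical attribute and correlating it with a two-valued label looks like. It is plain Python rather than a RapidMiner process, and it is not how any RapidMiner operator is implemented. The data and the names `fabric`, `dummy_code` and `pearson` are made up for illustration. It shows one thing that may be relevant to the "only negative correlations" observation: the sign of every correlation depends on which label value ends up coded as 1, so it could be worth checking how "recommended" was mapped to numbers.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def dummy_code(values):
    """Build one 0/1 column per distinct category (dummy / one-hot coding)."""
    return {cat: [1 if v == cat else 0 for v in values]
            for cat in sorted(set(values))}

# Hypothetical toy data, invented purely for this illustration.
fabric = ["cotton", "silk", "cotton", "wool", "silk", "cotton", "wool", "silk"]
recommended = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]

# The same label coded two ways: "yes" as 1, or "no" as 1.
label_yes = [1 if r == "yes" else 0 for r in recommended]
label_no = [1 - v for v in label_yes]

dummies = dummy_code(fabric)
corr_yes = {cat: pearson(col, label_yes) for cat, col in dummies.items()}
corr_no = {cat: pearson(col, label_no) for cat, col in dummies.items()}

# Each category's correlation flips sign when the label coding is flipped.
for cat in dummies:
    print(f"{cat:>6}: yes=1 -> {corr_yes[cat]:+.3f} | no=1 -> {corr_no[cat]:+.3f}")
```

The magnitudes are identical under both codings and only the signs differ, so a column of all-negative values says nothing by itself about whether the attributes are useful predictors.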




