
# Correlation Matrix Operator - which kind of correlation coefficient?

Member Posts: 5 Contributor II
edited June 2019 in Help
Hi,

I was wondering which kind of correlation coefficient is calculated by the "Correlation Matrix" operator?

Since I have a wide variety of factors and factor distributions (many of which are not Gaussian), I would prefer the Spearman correlation to the more familiar Pearson correlation (as Pearson correlation assumes the variables are Gaussian).
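To make the distinction in the question concrete, here is a minimal sketch in plain Python (not RapidMiner code) that computes both coefficients on a small, made-up monotone but non-linear data set. The helper names `pearson`, `average_ranks` and `spearman` are chosen for illustration only; Spearman is computed here as Pearson applied to the ranks, with tied values receiving their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Illustrative data: y = x**3 is monotone in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

On this toy data the rank-based coefficient is 1 (up to floating-point rounding) because the relationship is perfectly monotone, while the Pearson coefficient is roughly 0.94 because the relationship is not linear.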

Best regards,

Matthias

• Employee, Member Posts: 114 RM Data Scientist
Hi Matthias,

The "Correlation Matrix" operator uses the Pearson correlation coefficient. If you think other coefficients such as Spearman should be part of the operator in the next release, please use the feature request tracker (http://bugs.rapid-i.com/) and leave a short note.

Cheers,
Helge
• Member Posts: 1 Learner III
Hi there,

My question is a follow-up on this issue so I am replying here instead of opening a new post.
You mention the Correlation Matrix operator uses the Pearson correlation coefficient.
I have built a Correlation matrix for data which has a mixture of Nominal and Numeric variables,
and I seem to be getting a very decent matrix. For numerical pairs I assume it is using Pearson, as you said, but I have two questions:
1. How are the correlation coefficients for numeric vs. nominal (non-dichotomous) attributes calculated? (Is some specific point-multiserial coefficient used?)
2. How are the correlation coefficients for nominal (non-dichotomous) vs. nominal (non-dichotomous) attributes calculated? (Does it use Pearson's C or Cramér's V?)
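
For reference, one of the nominal vs. nominal measures named in question 2, Cramér's V, can be sketched in plain Python as below. This only illustrates the textbook definition (the chi-squared statistic scaled by the sample size and the smaller table dimension minus one); it is not a statement about what the Correlation Matrix operator computes. The function name `cramers_v` and the sample data are made up for illustration.

```python
from collections import Counter
from math import sqrt

def cramers_v(a, b):
    """Cramér's V for two equal-length sequences of nominal values."""
    n = len(a)
    joint = Counter(zip(a, b))   # observed counts per (row, column) cell
    row = Counter(a)             # marginal counts of the first attribute
    col = Counter(b)             # marginal counts of the second attribute
    chi2 = 0.0
    for r, nr in row.items():
        for c, nc in col.items():
            expected = nr * nc / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    # V is undefined when either attribute has a single category.
    return sqrt(chi2 / (n * k)) if k > 0 else float("nan")

# Perfectly associated attributes: each colour always pairs with one shape.
colour = ["red", "green", "blue", "red", "green", "blue"]
shape = ["x", "y", "z", "x", "y", "z"]
v_dependent = cramers_v(colour, shape)

# Independent attributes: every combination occurs equally often.
v_independent = cramers_v(["r", "r", "g", "g"], ["x", "y", "x", "y"])
print(v_dependent, v_independent)
```

Cramér's V lies between 0 (no association) and 1 (perfect association) and, unlike a correlation on integer codes, does not depend on any ordering of the categories.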

It would be very helpful to know this, along with any relevant references.

Sharper
• Member Posts: 1 Learner III
I'm interested in Sharper's questions as well.

Best,
Troddel
• RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
Hi all,

Offhand I am not completely sure, but I think it simply assigns numbers to the nominal values in the order of appearance (the same way Nominal to Numerical with unique_integers coding does), and then calculates Spearman's correlation coefficient.
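
The coding guessed at above can be sketched in plain Python as follows. This is an assumption-laden illustration of "integers in order of appearance", not RapidMiner's actual implementation, and the helper names `unique_integers` and `pearson` are hypothetical. Which coefficient is then applied to the codes is not settled in this thread; Pearson is used here purely to keep the sketch short.

```python
from math import sqrt

def unique_integers(values):
    """Map each nominal value to an integer in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up example set: a nominal attribute and a numeric attribute.
colour = ["red", "green", "blue", "red", "green", "blue"]
value = [1, 2, 3, 1, 2, 3]
r_original = pearson(unique_integers(colour), value)

# The same six rows in a different order, so "blue" appears first.
colour_shuffled = ["blue", "red", "green", "red", "green", "blue"]
value_shuffled = [3, 1, 2, 1, 2, 3]
r_shuffled = pearson(unique_integers(colour_shuffled), value_shuffled)
print(r_original, r_shuffled)
```

In this toy case the first ordering yields a coefficient of 1 and the second yields -0.5, even though both contain exactly the same rows. If the operator does work this way, coefficients involving non-dichotomous nominal attributes would depend on row order and should be interpreted with care.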

Best regards,
Marius