"Correlation Matrix Operator - which kind of correlation coefficient?"

South2wood Member Posts: 5 Contributor II
edited June 2019 in Help

I was wondering which kind of correlation coefficient the "Correlation Matrix" operator calculates.

Since I have a wide variety of factors and factor distributions (many of which are not Gaussian), I would prefer the Spearman correlation to the more familiar Pearson correlation (Pearson measures linear association, and its usual significance tests assume the variables are Gaussian).
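
To illustrate the difference, here is a minimal sketch in plain Python (not RapidMiner code). It relies on the standard fact that Spearman's coefficient is Pearson's coefficient computed on the ranks of the data. The function names and the toy data are made up for this example.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: a strictly increasing but strongly non-linear relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [math.exp(v) for v in x]

print("Pearson :", pearson(x, y))    # noticeably below 1
print("Spearman:", spearman(x, y))   # 1 up to floating-point rounding
```

If only a Pearson-based operator is available, one possible workaround along the same lines is to replace each numeric attribute by its ranks before computing the matrix.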

Best regards,



  • homburg Moderator, Employee, Member Posts: 114 RM Data Scientist
    Hi Matthias,

    The "Correlation Matrix" operator uses the Pearson correlation coefficient. If you think other coefficients such as Spearman should be part of the operator in the next release, please use the feature request tracker (http://bugs.rapid-i.com/) and leave a short note.

  • sharper Member Posts: 1 Contributor I
    Hi there,

    My question is a follow-up on this issue, so I am replying here instead of opening a new post.
    You mention that the Correlation Matrix operator uses the Pearson correlation coefficient.
    I have built a correlation matrix for data with a mixture of nominal and numeric variables,
    and I seem to be getting a very decent matrix. For numeric pairs I assume it is using Pearson, as you said, but what
    about the other pairs?
    1. How are the correlation coefficients for numeric vs. nominal (non-dichotomous) calculated? (Is some specific point-multiserial coefficient used?)
    2. How are the correlation coefficients for nominal (non-dichotomous) vs. nominal (non-dichotomous) calculated? (Does it use Pearson's C or Cramér's V?)

    It would be very helpful to know this, along with any relevant references.

    Thanks in advance,

  • Troddel Member Posts: 1 Contributor I
    I'm interested in Sharper's questions as well.

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi all,

    Offhand I am not completely sure, but I think it simply assigns numbers to the nominal values in order of appearance (the same way Nominal to Numerical with unique_integers coding does), and then calculates Spearman's correlation coefficient.

    Best regards,
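
The coding guessed at in the last reply can be sketched in plain Python as follows. This is an illustration under assumptions, not the operator's confirmed implementation: the helper names and the toy data are hypothetical, and Pearson is used on the codes, following the moderator's answer (a Spearman variant would rank the codes first). The point of the sketch is that integer codes assigned in order of appearance are arbitrary, so the resulting coefficient can change when only the row order changes.

```python
import math

def unique_integers(values):
    """Map each distinct nominal value to 0, 1, 2, ... in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy data: one nominal attribute and one numeric attribute.
colour = ["red", "green", "blue", "green", "red", "blue"]
size = [1.0, 2.0, 3.0, 2.1, 0.9, 3.2]

# Coding in the original row order: red=0, green=1, blue=2.
r_original = pearson(unique_integers(colour), size)

# The same rows in reverse order: blue=0, red=1, green=2.
r_reversed = pearson(unique_integers(colour[::-1]), size[::-1])

print("original row order:", r_original)   # strongly positive
print("reversed row order:", r_reversed)   # negative, although the data are identical
```

If the operator does work this way, coefficients involving non-dichotomous nominal attributes would depend on the order in which values first occur, so they should be read with caution.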