"Performance operator delivers correlation 0.000 if correlation is negative"

qwertzqwertz Member Posts: 130  Maven
edited June 2019 in Help
Hi RM community,

I was looking for a way to read a single correlation value out of the Correlation Matrix. As this doesn't seem to be feasible (in the community version, without reports, as described in the forum), I tried to capture the correlation from the "Performance (Regression)" operator instead.

I discovered that the Performance operator delivers 0.000 as the correlation whenever the correlation is negative.

To reproduce this behaviour, run the attached process and look at the result of the Correlation Matrix: "att1" and "att5" have a negative correlation to "label".
Then change the "name" parameter of the "Set Role (2)" operator to an arbitrary attribute and compare the result of the Performance operator to the Correlation Matrix.
If "att1" or "att5" is selected, the Performance operator shows 0.000.
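
For anyone who wants to cross-check the expected value outside RapidMiner, here is a minimal Python sketch of the plain Pearson correlation. The function name `pearson` and the sample numbers are made up for illustration and are not taken from the attached process. It only shows that a label and a prediction moving in opposite directions should give a negative value rather than 0.000.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative data only: the "prediction" falls while the "label" rises.
label = [1.0, 2.0, 3.0, 4.0, 5.0]
prediction = [10.0, 8.0, 6.5, 4.0, 2.0]

r = pearson(label, prediction)
print(f"correlation: {r:.3f}")  # negative for this data, not 0.000
```

If the Correlation Matrix and a check like this agree on a negative value while the Performance operator reports 0.000, the difference comes from the Performance operator and not from the data.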

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.003">
  <operator activated="true" class="process" compatibility="5.0.000" expanded="true" name="Root">
    <process expanded="true" height="494" width="567">
      <operator activated="true" class="generate_data" compatibility="5.2.003" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
      <operator activated="true" class="multiply" compatibility="5.2.003" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" height="76" name="Set Role" width="90" x="313" y="165">
        <parameter key="name" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="correlation_matrix" compatibility="5.2.003" expanded="true" height="94" name="Correlation Matrix" width="90" x="447" y="165"/>
      <operator activated="true" class="set_role" compatibility="5.2.003" expanded="true" height="76" name="Set Role (2)" width="90" x="313" y="30">
        <parameter key="name" value="att2"/>
        <parameter key="target_role" value="prediction"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="performance_regression" compatibility="5.2.003" expanded="true" height="76" name="Performance" width="90" x="447" y="30">
        <parameter key="main_criterion" value="correlation"/>
        <parameter key="root_mean_squared_error" value="false"/>
        <parameter key="correlation" value="true"/>
        <parameter key="skip_undefined_labels" value="false"/>
        <parameter key="use_example_weights" value="false"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Correlation Matrix" to_port="example set"/>
      <connect from_op="Correlation Matrix" from_port="matrix" to_port="result 2"/>
      <connect from_op="Correlation Matrix" from_port="weights" to_port="result 3"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Performance" to_port="labelled data"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="144"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>

Kind regards