"Correlation Matrix"

simsim Member Posts: 18 Learner I
edited May 23 in Help
I am trying to compute a correlation matrix on some data. However, the results do not include a correlation matrix, but rather a table with two columns where all of the attributes are listed in a single column. I have used the "nominal to binomial", "correlation matrix" and "select weights" operators.
Do you know what I am doing wrong?
Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,226   Unicorn
    If you can post your XML it would be easier to troubleshoot :-)
    But from your description, it sounds like you might be using Weight by Correlation, which only looks at the correlation between attributes and the defined label.  If you want the full correlation matrix you need to use the Correlation Matrix operator instead.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
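To illustrate the distinction described above: a label-based weighting yields one value per attribute (a two-column table), whereas a correlation matrix yields one value per pair of attributes. The following is a minimal Python sketch of that idea, not RapidMiner code; the attribute names and values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical numeric data: two regular attributes and a numeric label.
data = {
    "temperature": [85.0, 80.0, 83.0, 70.0, 68.0],
    "humidity": [85.0, 90.0, 78.0, 96.0, 80.0],
    "label": [0.0, 0.0, 1.0, 1.0, 1.0],
}

# Label-based view: one value per regular attribute (attribute, weight).
label_correlations = {
    name: pearson(values, data["label"])
    for name, values in data.items()
    if name != "label"
}

# Full matrix: one value for every pair of attributes.
names = list(data)
correlation_matrix = {
    a: {b: pearson(data[a], data[b]) for b in names} for a in names
}

for name, value in label_correlations.items():
    print(f"{name}: {value:.3f}")
for a in names:
    print(a, [round(correlation_matrix[a][b], 3) for b in names])
```

The first loop prints a two-column result like the one described in the question; the second prints the square matrix.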
  • simsim Member Posts: 18 Learner I
    Hi Telcontar120, 
    Thank you for such a quick response! I have removed the "select weights" operator, but am still facing the same problem. I would upload the XML file, but I don't know how to (I'm new to RapidMiner), sorry!

    Do you know if there's anything else that I can try?

  • simsim Member Posts: 18 Learner I
    Thank you mschmitz!!! That definitely helped!! I now have my results in the form of a correlation table.
    All of the categories within my attributes are now listed as individual attributes. Is there any way for this to be adjusted?
    Thank you once again! 
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,111  RM Data Scientist
    Hi,
    Pearson correlation is not defined for nominal types, so they can't be included directly.

    BR,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
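As a rough illustration of why the categories show up as individual attributes: Pearson correlation needs numbers, so a nominal attribute is typically turned into one 0/1 column per category before it can enter the matrix. Below is a minimal Python sketch of that dummy coding. The sample values and the "name = category" column naming are assumptions made for this example.

```python
def dummy_code(name, values):
    """Turn one nominal column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {
        f"{name} = {category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical nominal attribute with three categories.
outlook = ["sunny", "overcast", "rain", "sunny", "rain"]
encoded = dummy_code("Outlook", outlook)

for column, flags in encoded.items():
    print(column, flags)
```

Each of the three categories becomes its own numeric column, which is why a correlation matrix computed afterwards has one row and one column per category rather than per original attribute.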
  • simsim Member Posts: 18 Learner I
    Is there an operator that can be used to convert the data so it can be included?
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,111  RM Data Scientist
    Well, I would use a measure that can handle nominal data, e.g. Weight by Gini Index.
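To show how a Gini-based measure can score a nominal attribute against a nominal label without converting anything to numbers, here is a minimal Python sketch. It computes the reduction in Gini impurity obtained by splitting on the attribute. It is an assumption that this matches the operator's exact formula, and the data is made up.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(attribute, labels):
    """Reduction in Gini impurity when the labels are split by the attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(labels) - weighted


# Hypothetical nominal attribute and nominal label.
outlook = ["sunny", "sunny", "overcast", "rain", "rain", "overcast"]
play = ["no", "no", "yes", "yes", "no", "yes"]

print(round(gini_gain(outlook, play), 3))
```

The measure works directly on category counts, so the data itself stays nominal.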
  • simsim Member Posts: 18 Learner I
    Does Weight by Gini Index convert the data?
  • simsim Member Posts: 18 Learner I
    Hi, Weight by Gini Index did not work for me. Is there anything else that I can use?
  • simsim Member Posts: 18 Learner I
    edited January 2
    Hi mschmitz, hope you're well!
    Just wondering if there was an update?
  • simsim Member Posts: 18 Learner I
    Hi Martin, 

    Just wondering if you've seen my above message?

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,111  RM Data Scientist
    Hi @sim,
    I would go for something like the attached process. But please keep in mind that the resulting weights are not necessarily normalized the way a correlation is.
    <?xml version="1.0" encoding="UTF-8"?><process version="9.1.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="9.1.000" expanded="true" height="68" name="Retrieve Golf" width="90" x="179" y="85">
    <parameter key="repository_entry" value="//Samples/data/Golf"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="9.1.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">
    <parameter key="attribute_filter_type" value="value_type"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="9.1.000" expanded="true" height="82" name="Set Role" width="90" x="514" y="85">
    <parameter key="attribute_name" value="Play"/>
    <parameter key="target_role" value="regular"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="concurrency:loop_attributes" compatibility="9.1.000" expanded="true" height="82" name="Loop Attributes" width="90" x="715" y="85">
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="attribute_name_macro" value="loop_attribute"/>
    <parameter key="reuse_results" value="false"/>
    <parameter key="enable_parallel_execution" value="true"/>
    <process expanded="true">
    <operator activated="true" class="set_role" compatibility="9.1.000" expanded="true" height="82" name="Set Role (2)" width="90" x="112" y="34">
    <parameter key="attribute_name" value="%{loop_attribute}"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="weight_by_information_gain" compatibility="9.1.000" expanded="true" height="82" name="Weight by Information Gain" width="90" x="380" y="34">
    <parameter key="normalize_weights" value="false"/>
    <parameter key="sort_weights" value="true"/>
    <parameter key="sort_direction" value="ascending"/>
    </operator>
    <operator activated="false" class="weight_by_gini_index" compatibility="9.1.000" expanded="true" height="82" name="Weight by Gini Index" width="90" x="380" y="136">
    <parameter key="normalize_weights" value="false"/>
    <parameter key="sort_weights" value="true"/>
    <parameter key="sort_direction" value="ascending"/>
    </operator>
    <operator activated="true" class="weights_to_data" compatibility="9.1.000" expanded="true" height="68" name="Weights to Data" width="90" x="715" y="34"/>
    <operator activated="true" class="generate_attributes" compatibility="9.1.000" expanded="true" height="82" name="Generate Attributes" width="90" x="916" y="34">
    <list key="function_descriptions">
    <parameter key="label" value="%{loop_attribute}"/>
    </list>
    <parameter key="keep_all" value="true"/>
    </operator>
    <connect from_port="input 1" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Set Role (2)" from_port="example set output" to_op="Weight by Information Gain" to_port="example set"/>
    <connect from_op="Weight by Information Gain" from_port="weights" to_op="Weights to Data" to_port="attribute weights"/>
    <connect from_op="Weights to Data" from_port="example set" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="append" compatibility="9.1.000" expanded="true" height="82" name="Append" width="90" x="916" y="85">
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    <parameter key="merge_type" value="all"/>
    </operator>
    <operator activated="true" class="blending:pivot" compatibility="9.1.000" expanded="true" height="82" name="Pivot" width="90" x="1050" y="85">
    <parameter key="group_by_attributes" value="label"/>
    <parameter key="column_grouping_attribute" value="Attribute"/>
    <list key="aggregation_attributes">
    <parameter key="Weight" value="average"/>
    </list>
    <parameter key="use_default_aggregation" value="false"/>
    <parameter key="default_aggregation_function" value="first"/>
    </operator>
    <connect from_op="Retrieve Golf" from_port="output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Loop Attributes" to_port="input 1"/>
    <connect from_op="Loop Attributes" from_port="output 1" to_op="Append" to_port="example set 1"/>
    <connect from_op="Append" from_port="merged set" to_op="Pivot" to_port="input"/>
    <connect from_op="Pivot" from_port="output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
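For readers who want to see the logic of the process above outside of RapidMiner, here is a minimal Python sketch of the same idea: treat each nominal attribute as the label in turn, compute the information gain of every other attribute with respect to it, and collect the results into a matrix. The data is made up (loosely modelled on the Golf sample), and the entropy-based formula is an assumption that may differ in detail from the operator's implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of nominal values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(attribute, label):
    """Entropy of the label minus its entropy after splitting by the attribute."""
    n = len(label)
    groups = {}
    for value, target in zip(attribute, label):
        groups.setdefault(value, []).append(target)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


# Hypothetical nominal data.
data = {
    "Outlook": ["sunny", "sunny", "overcast", "rain", "rain", "overcast"],
    "Wind": ["false", "true", "false", "false", "true", "true"],
    "Play": ["no", "no", "yes", "yes", "no", "yes"],
}

# Outer loop: each attribute takes the label role once (Loop Attributes + Set Role).
# Inner loop: weight of every other attribute against that label.
# The nested dict plays the part of the appended and pivoted result table.
matrix = {
    label: {
        other: information_gain(data[other], data[label])
        for other in data
        if other != label
    }
    for label in data
}

for label, weights in matrix.items():
    print(label, {name: round(value, 3) for name, value in weights.items()})
```

The values are unnormalized: they are bounded by the entropy of the label rather than lying between -1 and 1, so they should be read as relative strengths of association rather than as correlation coefficients.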


