
Correlation Matrix: how to group correlated attributes together?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help

Hi,

I would like to group attributes that are similar or that correlate with each other. What is the best way to do that?

Should I use some kind of binning technique for all attributes? And should all attributes get the same binning range (one range for all attributes together), or should I bin each attribute separately according to some proportionality measure?

Sorry, I don't know where to start.


Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    What you can try is to use the Weight by Correlation operator and then use a binning operator to group your attributes. 

     

    Maybe something like this?

     

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="7.2.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Polynomial" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
          </operator>
          <operator activated="true" class="weight_by_correlation" compatibility="7.2.002" expanded="true" height="82" name="Weight by Correlation" width="90" x="179" y="34"/>
          <operator activated="true" class="weights_to_data" compatibility="7.2.002" expanded="true" height="68" name="Weights to Data" width="90" x="313" y="34"/>
          <operator activated="true" class="discretize_by_bins" compatibility="7.2.002" expanded="true" height="103" name="Discretize" width="90" x="447" y="34">
            <parameter key="number_of_bins" value="3"/>
          </operator>
          <connect from_op="Polynomial" from_port="output" to_op="Weight by Correlation" to_port="example set"/>
          <connect from_op="Weight by Correlation" from_port="weights" to_op="Weights to Data" to_port="attribute weights"/>
          <connect from_op="Weights to Data" from_port="example set" to_op="Discretize" to_port="example set input"/>
          <connect from_op="Discretize" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • Fred12 Member Posts: 344 Unicorn

    Yes, something like this, but how can I do this with regard to class prediction? After doing this, I only have the columns Attribute and Weight left.

  • Fred12 Member Posts: 344 Unicorn

    Does anybody have an idea how to do this binning automatically with the original dataset?

    This way it gives me nice ranges, but I think I would have to set them manually for each attribute.
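
For readers who want to see the logic of the suggested process outside RapidMiner, here is a minimal Python sketch. It is not the operators' implementation: it assumes plain Pearson correlation against a numeric label as the weight, does not reproduce any weight normalization, and uses a made-up toy dataset. All function and variable names (`weight_by_correlation`, `group_by_weight`, `bin_columns`, `a1` to `a4`) are hypothetical. It also shows that the weight groups can be used to pick columns from the original data, and that equal-width bin ranges can be derived per attribute from each column's own minimum and maximum.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation
    return cov / sqrt(vx * vy)

def weight_by_correlation(columns, label):
    """Weight each attribute by |correlation with the label| (assumption)."""
    return {name: abs(pearson(vals, label)) for name, vals in columns.items()}

def equal_width_bin(value, lo, hi, n_bins):
    """Index of the equal-width bin in [lo, hi] that contains value."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum falls into the last bin

def group_by_weight(weights, n_bins=3):
    """Group attribute names by the bin their weight falls into."""
    lo, hi = min(weights.values()), max(weights.values())
    groups = {b: [] for b in range(n_bins)}
    for name, w in weights.items():
        groups[equal_width_bin(w, lo, hi, n_bins)].append(name)
    return groups

def bin_columns(columns, n_bins=3):
    """Bin every attribute separately, using its own min and max."""
    binned = {}
    for name, vals in columns.items():
        lo, hi = min(vals), max(vals)
        binned[name] = [equal_width_bin(v, lo, hi, n_bins) for v in vals]
    return binned

# Toy data, invented for illustration only.
label = [1, 2, 3, 4, 5]
columns = {
    "a1": [2, 4, 6, 8, 10],
    "a2": [5, 4, 3, 2, 1],
    "a3": [2, 1, 5, 3, 4],
    "a4": [3, 1, 2, 1, 3],
}

weights = weight_by_correlation(columns, label)
groups = group_by_weight(weights, n_bins=3)

# Keep the original values of the attributes in the strongest group.
strongest = {name: columns[name] for name in groups[2]}

# Per-attribute binning of the original data, ranges found automatically.
binned = bin_columns(columns, n_bins=3)

print(groups)
print(binned["a1"])
```

The group lists hold attribute names, so they can drive an attribute selection on the original example set instead of replacing it. Note that this groups attributes by their correlation with the label, not by their correlation with each other.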
