Correlation Matrix: how to group correlated attributes together?

Fred12 Member Posts: 344 Unicorn
edited November 2018 in Help

Hi,

I would like to make groups of attributes that are similar or that correlate. How can I best do that?

Do I use some kind of binning technique for all attributes? And do all attributes get the same binning range (i.e. one range for all attributes together), or should I bin each attribute separately according to some proportionality measure?

Sorry, I don't know where to start...

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    What you can try is the Weight by Correlation operator, which weights each attribute by its correlation with the label, followed by a binning operator to group your attributes by weight.

    Maybe something like this?

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="7.2.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
        <process expanded="true">
          <!-- Load the Polynomial sample data set -->
          <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Polynomial" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
          </operator>
          <!-- Compute one correlation-based weight per attribute -->
          <operator activated="true" class="weight_by_correlation" compatibility="7.2.002" expanded="true" height="82" name="Weight by Correlation" width="90" x="179" y="34"/>
          <!-- Turn the weights into an example set with the columns Attribute and Weight -->
          <operator activated="true" class="weights_to_data" compatibility="7.2.002" expanded="true" height="68" name="Weights to Data" width="90" x="313" y="34"/>
          <!-- Bin the weights into 3 groups -->
          <operator activated="true" class="discretize_by_bins" compatibility="7.2.002" expanded="true" height="103" name="Discretize" width="90" x="447" y="34">
            <parameter key="number_of_bins" value="3"/>
          </operator>
          <connect from_op="Polynomial" from_port="output" to_op="Weight by Correlation" to_port="example set"/>
          <connect from_op="Weight by Correlation" from_port="weights" to_op="Weights to Data" to_port="attribute weights"/>
          <connect from_op="Weights to Data" from_port="example set" to_op="Discretize" to_port="example set input"/>
          <connect from_op="Discretize" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • Fred12 Member Posts: 344 Unicorn

    Yes, something like this, but how can I do this with regard to class prediction? After doing this, I only have the columns Attribute and Weight left.
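
    One possible way to keep the original columns for class prediction, sketched under assumptions: instead of converting the weights to data, feed both the weights and the original example set into a Select by Weights operator, so that only the highest-weighted attributes remain in the data. The operator class `select_by_weights`, its port names, the pass-through "example set" output of Weight by Correlation, and the `weight_relation` and `k` parameters are assumptions that are not confirmed in this thread and should be checked against your RapidMiner version. The value `k=3` is an arbitrary illustration.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <process version="7.2.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Polynomial" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
          </operator>
          <operator activated="true" class="weight_by_correlation" compatibility="7.2.002" expanded="true" height="82" name="Weight by Correlation" width="90" x="179" y="34"/>
          <!-- Assumed operator, ports and parameters: keep the 3 attributes with the highest weights -->
          <operator activated="true" class="select_by_weights" compatibility="7.2.002" expanded="true" height="103" name="Select by Weights" width="90" x="313" y="34">
            <parameter key="weight_relation" value="top k"/>
            <parameter key="k" value="3"/>
          </operator>
          <connect from_op="Polynomial" from_port="output" to_op="Weight by Correlation" to_port="example set"/>
          <connect from_op="Weight by Correlation" from_port="weights" to_op="Select by Weights" to_port="weights"/>
          <!-- Assumed pass-through output carrying the original data -->
          <connect from_op="Weight by Correlation" from_port="example set" to_op="Select by Weights" to_port="example set input"/>
          <connect from_op="Select by Weights" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    If this works as assumed, the result still contains the original examples and the label, so it could be passed on to a learner.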

  • Fred12 Member Posts: 344 Unicorn

    Does anybody have an idea how to do this binning automatically with the original dataset?

    This way it gives me nice ranges, but I think I would have to set them manually for each attribute?
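
    One option that may be worth trying is to connect Discretize directly to the original data instead of the weights table. As far as I know, Discretize by Binning derives equal-width ranges for each numerical attribute from that attribute's own minimum and maximum, so the ranges would not have to be set by hand. This is a minimal sketch that reuses only the operators and port names from the process posted above. The value `number_of_bins=3` is an arbitrary choice, and the per-attribute behaviour should be confirmed in the operator help.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <process version="7.2.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Polynomial" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
          </operator>
          <!-- Bin the original attributes directly; ranges are assumed to be computed per attribute -->
          <operator activated="true" class="discretize_by_bins" compatibility="7.2.002" expanded="true" height="103" name="Discretize" width="90" x="179" y="34">
            <parameter key="number_of_bins" value="3"/>
          </operator>
          <connect from_op="Polynomial" from_port="output" to_op="Discretize" to_port="example set input"/>
          <connect from_op="Discretize" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="90"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```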
