
# [SOLVED] Unexpected output for linear regression operator

dysprosium (Member, Posts: 7, Contributor II)
I'm quite new to RapidMiner and am trying linear regression for the first time. I'm applying the Linear Regression operator to a training data set; the resulting regression model is the input to the Apply Model operator, which then applies the model to an unlabelled data set.

There is one special attribute (label) and 3 regular attributes in the training data set. It has only 7 examples at the moment (easier for me to see what’s happening). The attributes are all integers.

In the attribute weights output from the linear regression, I’m expecting all of the 3 regular attributes to have a weight greater than zero. However, when I run the process only one attribute has a weight greater than 0. Its weight is 0.268. The other two attributes have a weight of 0. It seems as if the linear regression operator is ignoring those two attributes. Why?

The reason I expect all of the weights from the Linear Regression to be non-zero is that when I input *exactly the same training set* to the Vector Linear Regression operator, I get either positive or negative weights for all three regular attributes.

<process version="6.0.002">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" breakpoints="after" class="retrieve" compatibility="6.0.002" expanded="true" height="60" name="Retrieve regression data Oct 13 training set 1 special (label) attribute 3 regular attributes" width="90" x="112" y="75">
        <parameter key="repository_entry" value="../data/regression data Oct 13 training set 1 special (label) attribute 3 regular attributes"/>
      </operator>
      <operator activated="true" breakpoints="after" class="linear_regression" compatibility="6.0.002" expanded="true" height="94" name="Linear Regression" width="90" x="313" y="30"/>
      <operator activated="true" breakpoints="after" class="retrieve" compatibility="6.0.002" expanded="true" height="60" name="Retrieve regression data Oct 13 UNLABELLED set 1 special (label) attribute 3 regular attributes" width="90" x="112" y="255">
        <parameter key="repository_entry" value="../data/regression data Oct 13 UNLABELLED set 1 special (label) attribute 3 regular attributes"/>
      </operator>
      <operator activated="true" breakpoints="after" class="apply_model" compatibility="6.0.002" expanded="true" height="76" name="Apply Model" width="90" x="514" y="165">
        <list key="application_parameters"/>
      </operator>
      <connect from_op="Retrieve regression data Oct 13 training set 1 special (label) attribute 3 regular attributes" from_port="output" to_op="Linear Regression" to_port="training set"/>
      <connect from_op="Linear Regression" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Linear Regression" from_port="weights" to_port="result 2"/>
      <connect from_op="Retrieve regression data Oct 13 UNLABELLED set 1 special (label) attribute 3 regular attributes" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>


## Answers

Posts: 114, RM Data Scientist

The Linear Regression operator in RapidMiner offers a few built-in features such as feature selection and collinear feature elimination. Please set feature selection to none (the default is M5 prime) and disable the "eliminate colinear features" check box. Now the algorithm should use all three of your attributes.

Cheers,

Helge
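
As a rough sketch of the settings described above, the Linear Regression operator in the process XML might end up looking like the fragment below. The parameter keys `feature_selection` and `eliminate_colinear_features` are assumed from the labels in the operator's parameter panel and are not confirmed anywhere in this thread, so check them against the XML view of your own process.

```xml
<!-- Sketch only: parameter keys are assumed from the GUI labels, not confirmed in this thread -->
<operator activated="true" breakpoints="after" class="linear_regression" compatibility="6.0.002" expanded="true" height="94" name="Linear Regression" width="90" x="313" y="30">
  <!-- use all regular attributes instead of the default M5 prime selection -->
  <parameter key="feature_selection" value="none"/>
  <!-- keep attributes even when they are collinear with others -->
  <parameter key="eliminate_colinear_features" value="false"/>
</operator>
```

With only 7 examples and 3 regular attributes, either of the two default options can plausibly drop attributes, which would show up as weights of 0 in the weights output.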

Posts: 7, Contributor II

I corrected the feature settings and am now getting weights for all the attributes.

Just one more question: the results I get (for this data set) from the Linear Regression operator are exactly the same (except for formatting) as the results from Vector Linear Regression. How do the two algorithms differ?

Thanks!

Dy

Posts: 114, RM Data Scientist

The algorithms differ only in those feature selection options you just disabled. The vector version does the same job as Linear Regression but for a vector label (in this case several numerical attributes). If you input a single label, you will get more or less the same model.

Cheers,

Helge
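
To illustrate why a single label gives the same model, here is a small sketch. It assumes (this is not stated in the thread) that both operators fit ordinary least squares and that any ridge term is negligible. The symbols $X$ (the attribute matrix including a bias column), $Y$ (the label matrix with columns $y_1, \dots, y_k$) and $\hat{B}$ (the coefficient matrix) are introduced here only for illustration.

$$
\hat{B} = (X^\top X)^{-1} X^\top Y
        = \left[\, (X^\top X)^{-1} X^\top y_1, \;\dots,\; (X^\top X)^{-1} X^\top y_k \,\right]
$$

Each column of $\hat{B}$ is the ordinary single-label solution for the corresponding label, so for $k = 1$ this reduces to $\hat{\beta} = (X^\top X)^{-1} X^\top y_1$, the same coefficients a plain linear regression on that label would give.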

Posts: 7, Contributor II

Cheers,

Dy