# Validating a linear regression

I figure the capabilities I'm looking for must be available - I just haven't been able to find them.

When generating a Linear Regression in RapidMiner v5 (.008 - the upgrade to .015 isn't working for me), I am trying to figure out how to get the measures and plots that are used to validate the assumptions of a Linear Regression. With the standard output of the Linear Regression operator, I can find the R-Square and the t-test results for the individual variables. I can use the t-test results to imply the model-level F-test.

The additional information I am looking for includes things like the Adjusted R-Square, a plot of the errors, a QQ plot, the Variance Inflation Factor, Cook's distance, and that sort of thing. I originally learned validation of linear regression using PROC REG from SAS, if that helps frame the sort of information I'm looking for.
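To make a few of these quantities concrete, here is a rough sketch in plain Python, outside RapidMiner, of the Adjusted R-Square, leverage, and Cook's distance for a regression with a single predictor. The function name and the toy data are made up for illustration; it is not how RapidMiner or PROC REG computes them internally, and the Variance Inflation Factor is left out because it only makes sense with several predictors.

```python
def simple_regression_diagnostics(xs, ys):
    """Diagnostics for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    p = 2  # number of fitted parameters: intercept and slope
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in residuals)        # residual sum of squares
    sst = sum((y - mean_y) ** 2 for y in ys)   # total sum of squares
    r_squared = 1.0 - sse / sst
    # Adjusted R-Square penalises the number of fitted parameters.
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / (n - p)

    mse = sse / (n - p)  # residual variance estimate
    # Leverage of each point (diagonal of the hat matrix, one-predictor form).
    leverage = [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]
    # Cook's distance: squared residual scaled by leverage.
    cooks = [
        (r * r / (p * mse)) * h / (1.0 - h) ** 2
        for r, h in zip(residuals, leverage)
    ]
    return {
        "residuals": residuals,
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
        "leverage": leverage,
        "cooks_distance": cooks,
    }


# Toy data (invented): the last point sits far out in x and off the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
diag = simple_regression_diagnostics(xs, ys)
most_influential = max(range(len(xs)), key=lambda i: diag["cooks_distance"][i])
print(diag["adj_r_squared"], most_influential)
```

The residuals returned here are also what a plot of errors or a QQ plot would be drawn from.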

I figure these tests and plots have to be available in RapidMiner - any hints or pointers to where I can get that info are greatly appreciated.

Thanks

## Answers

Contributor II: Thanks!

Unicorn: Usually we use an X-Validation to validate the Linear Regression, the same way as we do with all supervised learning algorithms.

Basically, the X-Validation repeatedly splits the data into a training set and a test set, fits the linear regression model on the training set, applies it to the test set, and calculates a performance measure.
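As a rough illustration of that procedure outside RapidMiner, here is a minimal k-fold cross-validation sketch in plain Python. The helper names, the single-predictor model, and the toy data are assumptions made for the example; the X-Validation operator itself is configured in the process view rather than coded like this.

```python
import random


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validate(xs, ys, k=5, seed=0):
    """Return the root mean squared error on each of k held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint test sets
    fold_rmse = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        # Fit on the training part only.
        intercept, slope = fit_simple_ols(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        # Score on the part the model has not seen.
        sq_errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx]
        fold_rmse.append((sum(sq_errors) / len(sq_errors)) ** 0.5)
    return fold_rmse


# Toy data (invented): y = 2x + 1 with a small alternating offset.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
scores = cross_validate(xs, ys, k=5)
mean_rmse = sum(scores) / len(scores)
print(scores, mean_rmse)
```

Averaging the per-fold errors gives one estimate of how the model performs on data it was not trained on.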

With the Performance (Regression) operator you have a wide choice of measures to calculate.
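For a sense of what such measures look like, here is a small sketch in plain Python that computes a few common ones from actual and predicted values. The function and the dictionary keys are illustrative names chosen for this example and are not guaranteed to match the labels or exact definitions used by Performance (Regression).

```python
def regression_performance(actual, predicted):
    """Compute a few common regression error measures."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    sse = sum(e * e for e in errors)                  # sum of squared errors
    sst = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "rmse": (sse / n) ** 0.5,                  # root mean squared error
        "mae": sum(abs(e) for e in errors) / n,    # mean absolute error
        "mse": sse / n,                            # mean squared error
        "r_squared": 1.0 - sse / sst,              # share of variance explained
    }


# Toy values (invented) standing in for a test set and the model's predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]
perf = regression_performance(actual, predicted)
print(perf)
```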

Best regards,

Marius

Contributor II: Thanks