Validating a linear regression
I figure the capabilities I'm looking for must be available - I just haven't been able to find them.
When generating a Linear Regression in RapidMiner v5 (.008 - the upgrade to .015 isn't working for me), I am trying to figure out how to get the various measures and plots that are used to validate the various assumptions of a Linear Regression. With the standard output of the Linear Regression operator, I can find the R Square and T-test results for the individual variables. I can use the T-test results to imply the model level F-test.
The additional information I am looking for includes things like the Adjusted R-Square, a plot of the errors, a QQ plot, the Variance Inflation Factor, Cook's distance, and that sort of thing. I originally learned validation of linear regression using PROC REG from SAS, if that helps frame the sort of information I'm looking for.
I figure these tests and plots have to be available in RapidMiner - any hints or pointers to where I can get that info are greatly appreciated.
Thanks
Answers
Thanks!
Usually we use X-Validation to validate the Linear Regression, the same way we do with all supervised learning algorithms.
Basically the X-Validation splits the data numerous times into test and training set, calculates the linear regression model on the training set, applies it on the test set and calculates a performance measure.
With the Performance (Regression) operator you have a wide choice of measures to calculate.
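For readers who want to see the idea outside RapidMiner, here is a rough Python sketch of the procedure described above. It is not how the X-Validation or Performance (Regression) operators are implemented; the function names, the synthetic data, the single-predictor model, and the choice of RMSE and mean absolute error as measures are all assumptions made for illustration.

```python
import math
import random


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x (one predictor only)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def cross_validate(xs, ys, k=5, seed=0):
    """Hypothetical helper: k-fold cross-validation returning per-fold errors."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k disjoint test folds.
    folds = [indices[i::k] for i in range(k)]
    results = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        # Fit on the training part only.
        intercept, slope = fit_simple_ols(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        # Apply the model to the held-out part and measure the errors.
        errors = [ys[i] - (intercept + slope * xs[i]) for i in test_idx]
        rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
        mae = sum(abs(e) for e in errors) / len(errors)
        results.append({"rmse": rmse, "mae": mae})
    return results


# Synthetic example data (an assumption, not from the thread).
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(100)]
ys = [3.0 + 2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

scores = cross_validate(xs, ys, k=5)
mean_rmse = sum(s["rmse"] for s in scores) / len(scores)
print("per-fold RMSE:", [round(s["rmse"], 3) for s in scores])
print("mean RMSE:", round(mean_rmse, 3))
```

Averaging the per-fold measures gives an estimate of how the model performs on data it has not seen. This checks predictive performance only; it does not produce the assumption diagnostics asked about in the question.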
Best regards,
Marius
Thanks