# "How to set up multiple regression analysis?"

Hello everyone. I'm very new to RM, and got here when I was looking for tools to help me run statistical inference tests on some data. I've searched through previous posts on this topic but haven't quite found an answer.

Assume I have one "dependent" variable -- some measurement of interest, such as "age when death occurs". Call this variable Y.

Assume I have 3 "independent" variables that could be measures of "unhealthy" factors: X1=number of pounds overweight; X2=cholesterol level; X3=triglyceride level. Assume I have 100 (deceased) people with measurements on all these variables.

I want to run a multiple regression analysis (http://en.wikipedia.org/wiki/Regression_analysis) to see how well the combination of the 3 independent variables "predicts" the dependent variable, and end up with a value for F (and its significance level), a value for R squared, and the regression coefficients.
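To make the target concrete, here is a minimal sketch in plain Python (outside RM) of the quantities I mean: coefficients, R squared, and the overall F. The data are synthetic and the helper names `solve` and `ols` are made up for illustration; nothing here reflects how RM operators work.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Returns (coefficients, R^2, F, df_model, df_resid).
    """
    n, k = len(y), len(X[0])
    D = [[1.0] + list(row) for row in X]  # design matrix with intercept column
    p = k + 1
    # Normal equations: (D'D) beta = D'y
    XtX = [[sum(D[i][a] * D[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(D[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * dj for bj, dj in zip(beta, row)) for row in D]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    df_model, df_resid = k, n - k - 1
    f_stat = ((ss_tot - ss_res) / df_model) / (ss_res / df_resid)
    return beta, r2, f_stat, df_model, df_resid

# Synthetic stand-in for the 100 people (made-up numbers, not real data).
random.seed(0)
X = [[random.gauss(20, 10), random.gauss(200, 30), random.gauss(150, 40)]
     for _ in range(100)]
y = [90 - 0.2 * x1 - 0.03 * x2 - 0.02 * x3 + random.gauss(0, 3)
     for x1, x2, x3 in X]

beta, r2, f_stat, df_model, df_resid = ols(X, y)
print("coefficients (intercept, X1, X2, X3):", beta)
print("R squared:", r2)
print("F(%d, %d) = %f" % (df_model, df_resid, f_stat))
```

The significance level for F would then come from the F distribution with (`df_model`, `df_resid`) degrees of freedom, for example via `scipy.stats.f.sf(f_stat, df_model, df_resid)` if SciPy is installed.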

I see the Linear Regression and Vector Linear Regression operators. I also see the ANOVA operator, and the ANOVA Matrix and Grouped ANOVA operators. (I don't know how to obtain a Performance Vector, which is required for the ANOVA operator.) But I don't see how to get what I need using some combination of these and other operators. Is RM not intended for this kind of inferential significance testing?

Different but related question: Assume I have one dependent variable Y, 2 independent variables A & B, and a 2x2 experimental design. Can RM calculate the 3 ANOVA F values to indicate whether variable A, and/or B, and/or their AxB interaction have statistically significant effects on Y?
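For clarity, the three F values I have in mind are the ones from a standard two-way ANOVA. A minimal plain-Python sketch follows, assuming a balanced design (equal observations per cell) with two levels per factor; the function name `two_way_anova` and the toy numbers are made up for illustration and have nothing to do with RM's ANOVA operators.

```python
def two_way_anova(cells):
    """F values for a balanced 2x2 design.

    cells[(a, b)] is the list of Y values for level a of A and level b of B,
    with a, b in {0, 1}. Returns ({effect: F}, df_error); each effect has 1 df.
    """
    n = len(cells[(0, 0)])
    assert all(len(v) == n for v in cells.values()), "balanced design assumed"
    all_y = [y for v in cells.values() for y in v]
    grand = sum(all_y) / len(all_y)
    cell_mean = {k: sum(v) / n for k, v in cells.items()}
    a_mean = {a: (cell_mean[(a, 0)] + cell_mean[(a, 1)]) / 2 for a in (0, 1)}
    b_mean = {b: (cell_mean[(0, b)] + cell_mean[(1, b)]) / 2 for b in (0, 1)}

    # Sums of squares for the two main effects and the interaction.
    ss_a = 2 * n * sum((a_mean[a] - grand) ** 2 for a in (0, 1))
    ss_b = 2 * n * sum((b_mean[b] - grand) ** 2 for b in (0, 1))
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in (0, 1) for b in (0, 1))
    # Within-cell (error) sum of squares.
    ss_err = sum((y - cell_mean[k]) ** 2 for k, v in cells.items() for y in v)
    df_err = 4 * (n - 1)
    ms_err = ss_err / df_err
    f_values = {"A": ss_a / ms_err, "B": ss_b / ms_err, "AxB": ss_ab / ms_err}
    return f_values, df_err

# Toy data: two observations per cell (made-up numbers).
cells = {
    (0, 0): [1.0, 3.0],
    (0, 1): [3.0, 5.0],
    (1, 0): [5.0, 7.0],
    (1, 1): [7.0, 9.0],
}
f_values, df_err = two_way_anova(cells)
print(f_values, "error df:", df_err)
```

Each F would be compared against the F distribution with 1 and `df_err` degrees of freedom to get its significance level.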

Thanks for any and all tips!
