"How to set up multiple regression analysis?"
Assume I have one "dependent" variable -- some measurement of interest, such as "age when death occurs". Call this variable Y.
Assume I have 3 "independent" variables that could be measures of "unhealthy" factors: X1=number of pounds overweight; X2=cholesterol level; X3=triglyceride level. Assume I have 100 (deceased) people with measurements on all these variables.
I want to run a multiple regression analysis (http://en.wikipedia.org/wiki/Regression_analysis) to see how well the combination of the 3 independent variables "predicts" the dependent variable, and end up with a value for F (and its significance level), a value for R squared, and the regression coefficients.
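To make concrete which numbers I mean, here is a minimal sketch in plain Python (not RapidMiner) of the quantities I am after. Everything in it is an assumption for illustration: the data are synthetic, the coefficients used to generate them are made up, and the helper names `solve` and `ols` are hypothetical. It fits Y = b0 + b1*X1 + b2*X2 + b3*X3 by ordinary least squares and reports the coefficients, R squared, and the overall F statistic with (k, n-k-1) degrees of freedom. The significance level would come from the F distribution with those degrees of freedom, which this sketch does not compute.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2, F)."""
    n, k = len(rows), len(rows[0])
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    p = k + 1
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * v for bj, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Overall F test of the regression, with k and n-k-1 degrees of freedom
    f = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return beta, r2, f

# Synthetic stand-in for the 100 people (made-up numbers, not real data)
rng = random.Random(0)
rows = [(rng.uniform(0, 80), rng.uniform(150, 300), rng.uniform(50, 400))
        for _ in range(100)]
y = [90 - 0.1 * x1 - 0.03 * x2 - 0.01 * x3 + rng.gauss(0, 3)
     for x1, x2, x3 in rows]

beta, r2, f = ols(rows, y)
print("coefficients (b0..b3):", [round(b, 4) for b in beta])
print("R squared:", round(r2, 4))
print("F(3, 96):", round(f, 2))
```

The normal-equations approach is used here only to keep the sketch free of dependencies; a statistics package would normally do this step and also report the p-value.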
I see the Linear Regression and Vector Linear Regression operators. I also see the ANOVA operator, and the ANOVA Matrix and Grouped ANOVA operators. (I don't know how to obtain a Performance Vector, which is required for the ANOVA operator.) But I don't see how to get what I need using some combination of these and other operators. Is RM not intended for this kind of inferential significance testing?
Different but related question: Assume I have one dependent variable Y, 2 independent variables A & B, and a 2x2 experimental design. Can RM calculate the 3 ANOVA F values to indicate whether variable A, and/or B, and/or their AxB interaction, have statistically significant effects on Y?
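Again, to show which three F values I mean, here is a minimal plain-Python sketch (not RapidMiner) for a balanced 2x2 design, that is, the same number of observations in each of the four cells. The function name `two_way_anova_2x2`, the level labels, and the example numbers are all hypothetical. Each effect has 1 degree of freedom and the error term has 4*(n-1), so each F would be compared against the F distribution with (1, 4*(n-1)) degrees of freedom; that lookup is not done here.

```python
from statistics import mean

def two_way_anova_2x2(cells):
    """cells maps (a_level, b_level) -> list of Y values.

    Assumes a balanced 2x2 design with at least 2 observations per cell.
    Returns a dict with the F values for A, B, and the AxB interaction.
    """
    sizes = {len(v) for v in cells.values()}
    if len(cells) != 4 or len(sizes) != 1:
        raise ValueError("expected a balanced 2x2 design")
    n = sizes.pop()
    if n < 2:
        raise ValueError("need at least 2 observations per cell")

    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    grand = mean(v for vals in cells.values() for v in vals)

    # Marginal means for each level of A and B, and the four cell means
    a_mean = {a: mean(v for (ai, _), vals in cells.items() if ai == a for v in vals)
              for a in a_levels}
    b_mean = {b: mean(v for (_, bi), vals in cells.items() if bi == b for v in vals)
              for b in b_levels}
    cell_mean = {key: mean(vals) for key, vals in cells.items()}

    ss_a = len(b_levels) * n * sum((m - grand) ** 2 for m in a_mean.values())
    ss_b = len(a_levels) * n * sum((m - grand) ** 2 for m in b_mean.values())
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b  # interaction is what the main effects leave over
    ss_within = sum((v - cell_mean[key]) ** 2
                    for key, vals in cells.items() for v in vals)

    ms_within = ss_within / (4 * (n - 1))
    # Each effect has 1 degree of freedom in a 2x2 design, so MS equals SS
    return {"A": ss_a / ms_within, "B": ss_b / ms_within, "AxB": ss_ab / ms_within}

# Made-up example: 3 observations in each of the 4 cells
example = {
    ("a1", "b1"): [1, 2, 3],
    ("a1", "b2"): [3, 4, 5],
    ("a2", "b1"): [5, 6, 7],
    ("a2", "b2"): [7, 8, 9],
}
print(two_way_anova_2x2(example))
```

The sums-of-squares decomposition used here only holds for balanced designs; with unequal cell sizes a regression-based approach would be needed instead.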
Thanks for any and all tips!