
forward selection and backward elimination

AD2019 (Member, University Professor) Posts: 13
I ran a multiple regression model on a dataset with 15 variables, first using the "forward selection" nested operator and then using the "backward elimination" nested operator. I got dramatically different models: the first had 3 independent variables (IVs), the second had 8. Why such a big difference? I realize that serially adding or eliminating IVs may yield local optima, but is it common to get such wildly different "optimal" models for the same dataset? How can training on the same data yield such dramatically different models?
Thanks in advance,
AD

Answers

  • AD2019 (Member, University Professor) Posts: 13
    Thank you for the response. I did increase the number of speculative iterations to get around the issue of local minima, but this option appears to be available only for forward selection. The backward elimination operator does not seem to have it, and my suspicion is that backward elimination is getting stuck in a local optimum, whereas forward selection (with speculative iterations set to 30) is getting past it.
  • AD2019 (Member, University Professor) Posts: 13
    My apologies, you are correct: backward elimination does have the speculative option. I ran forward selection and backward elimination with speculative iterations set to 30 in both cases and still get very different models: 3 IVs in one direction, 8 in the other. I guess this is okay if the objective is prediction ("I don't care what the IVs are as long as prediction is good"), but it is rather disturbing if you are building a model to understand the contribution of each IV. Sometimes the inner workings of RapidMiner are inscrutable. In another regression model, I set alpha to 0.01 for feature selection using the t-test, and RapidMiner still kept IVs with a p-value of 0.05. I didn't understand that one.
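For anyone who wants to reproduce the effect outside RapidMiner, here is a minimal sketch of greedy forward selection and backward elimination in plain NumPy. It is not RapidMiner's implementation: the BIC scoring criterion, the stopping rule (stop at the first step that does not improve the score), the synthetic correlated dataset, and all function names are assumptions made for illustration only, and there are no speculative iterations. The point is simply that the two searches start from opposite ends of the subset space and only ever take locally improving steps, so with correlated predictors they need not end at the same subset.

```python
import numpy as np


def bic(X, y, cols):
    """BIC of an OLS fit (with intercept) on the given columns; lower is better."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return float(n * np.log(rss / n) + A.shape[1] * np.log(n))


def forward_selection(X, y):
    """Start with no predictors; repeatedly add the one that lowers BIC most."""
    selected, remaining = [], list(range(X.shape[1]))
    best = bic(X, y, selected)
    while remaining:
        score, j = min((bic(X, y, selected + [k]), k) for k in remaining)
        if score >= best:  # no single addition improves the score: stop
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return sorted(selected), best


def backward_elimination(X, y):
    """Start with all predictors; repeatedly drop the one whose removal lowers BIC most."""
    selected = list(range(X.shape[1]))
    best = bic(X, y, selected)
    while selected:
        score, j = min(
            (bic(X, y, [k for k in selected if k != j]), j) for j in selected
        )
        if score >= best:  # no single removal improves the score: stop
            break
        selected.remove(j)
        best = score
    return sorted(selected), best


# Hypothetical data: 15 predictors that all share 3 latent factors,
# so many different subsets explain y almost equally well.
rng = np.random.default_rng(0)
n, p = 200, 15
Z = rng.normal(size=(n, 3))
X = Z @ rng.normal(size=(3, p)) + 0.5 * rng.normal(size=(n, p))
y = Z.sum(axis=1) + rng.normal(size=n)

fwd, fwd_score = forward_selection(X, y)
bwd, bwd_score = backward_elimination(X, y)
print("forward :", fwd, round(fwd_score, 2))
print("backward:", bwd, round(bwd_score, 2))
```

Comparing the two printed subsets and their scores on your own data shows whether the searches disagree because one is stuck in a clearly worse optimum, or because several subsets of correlated IVs fit about equally well. In the second case the choice of IVs is not well determined by the data, which matters for interpretation more than for prediction.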