I ran multiple regression on a dataset with 15 variables, first using the "forward selection" nested operator and then using the "backward elimination" nested operator, and got dramatically different models: the first had 3 independent variables, the second had 8. Why such a big difference? I realize that the serial addition or elimination of IVs may yield local optima, but is it common to get such wildly different "optimal" models for the same dataset? How can training yield such dramatically different trained models?
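To make the "local optima" point concrete, here is a minimal, self-contained sketch (pure NumPy, my own toy greedy implementations, not the actual nested operators) of why the two search directions can disagree. With two nearly collinear predictors whose *difference* drives the response, neither helps alone, so greedy forward selection adds nothing, while backward elimination, starting from the full model, keeps both:

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X]) if X.shape[1] else np.ones((len(y), 1))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_select(X, y, improve=0.10):
    """Greedily add the predictor that most reduces RSS, while the
    best addition still cuts RSS by at least `improve` (toy criterion)."""
    selected, remaining = [], list(range(X.shape[1]))
    current = rss(X[:, []], y)
    while remaining:
        best, j = min((rss(X[:, selected + [j]], y), j) for j in remaining)
        if best < (1 - improve) * current:
            selected.append(j)
            remaining.remove(j)
            current = best
        else:
            break
    return set(selected)

def backward_eliminate(X, y, worsen=0.10):
    """Greedily drop the predictor whose removal hurts RSS least, while
    the best removal inflates RSS by less than `worsen` (toy criterion)."""
    selected = list(range(X.shape[1]))
    current = rss(X, y)
    while selected:
        best, j = min((rss(X[:, [k for k in selected if k != j]], y), j)
                      for j in selected)
        if best < (1 + worsen) * current:
            selected.remove(j)
            current = best
        else:
            break
    return set(selected)

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)      # nearly collinear with x1
x3 = rng.normal(size=n)                 # pure noise predictor
y = 10.0 * (x1 - x2) + 0.1 * rng.normal(size=n)  # only the difference matters
X = np.column_stack([x1, x2, x3])

print(forward_select(X, y))      # forward stalls: no single predictor helps alone
print(backward_eliminate(X, y))  # backward keeps the collinear pair, drops noise
```

Both procedures are "greedy" in opposite directions, so each can get stuck at a different local optimum of the same selection criterion; real stepwise operators use fancier criteria (p-values, AIC) but have the same failure mode.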
thanks in advance,