Stacking: probabilities instead of label?
spitfire_ch
Member Posts: 38 Maven
Hi,
I was experimenting with stacking and noticed that the only output the base learners provide to the "stacking model learner" is the final label. Don't most base learners contain more information than just the final label? More specifically, couldn't one pass probabilities instead of the final label?
E.g. if the leaf of choice in a decision tree contains 3 positive and 2 negative cases, pass 3/5 instead of P. That way, each "guess" by the base models would automatically be weighted. If model 1 is sure about a result and the others are not (and predict a different outcome), then the prediction of model 1 would be favored.
Best regards
Hanspeter
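The effect described above can be illustrated with a minimal sketch. It is not RapidMiner code: the function names are hypothetical, and a plain average of confidences stands in for whatever the stacking model learner would actually learn. One confident model predicts P while two uncertain leaves (2 positive, 3 negative cases each) predict N.

```python
def leaf_confidence(n_positive, n_negative):
    """Fraction of positive training cases in a decision-tree leaf."""
    return n_positive / (n_positive + n_negative)


def hard_vote(confidences, threshold=0.5):
    """Combine labels only: each model votes P if its confidence exceeds the threshold."""
    votes_for_p = sum(1 for c in confidences if c > threshold)
    return "P" if votes_for_p > len(confidences) / 2 else "N"


def soft_vote(confidences, threshold=0.5):
    """Combine confidences: average them, then apply the threshold once."""
    mean_confidence = sum(confidences) / len(confidences)
    return "P" if mean_confidence > threshold else "N"


# Model 1 is very sure of P; models 2 and 3 lean weakly towards N.
confidences = [0.95, leaf_confidence(2, 3), leaf_confidence(2, 3)]

print(leaf_confidence(3, 2))   # 0.6, the 3/5 from the example above
print(hard_vote(confidences))  # N: two weak N labels outvote one strong P
print(soft_vote(confidences))  # P: the confident model dominates the average
```

With labels alone the two weak N guesses win; with confidences the sure model is favored, which is the weighting effect asked about.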
Answers
I fully agree that passing the confidences in addition to, or even instead of, the predictions could definitely improve the quality of the complete model. I suppose that the original paper only passed the predictions and we probably stuck to this description. In order not to break compatibility while still allowing these different options, I would suggest adding a new parameter that lets you choose between "predictions only", "confidences only", or "predictions and confidences".
Thanks for sending this in. Cheers,
Ingo
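As a rough illustration of the suggested parameter, the sketch below builds the input row for the stacking model learner from the base learners' outputs. It is only an assumption about how such an option might behave, not the actual RapidMiner implementation; the function name `meta_features` and the `(label, confidence)` tuple layout are hypothetical. Only the three mode strings come from the suggestion above.

```python
MODES = ("predictions only", "confidences only", "predictions and confidences")


def meta_features(base_outputs, mode="predictions only"):
    """Build one input row for the stacking model learner.

    base_outputs: list of (label, confidence) pairs, one per base learner.
    mode: which parts of each base learner's output are passed on.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    row = []
    for label, confidence in base_outputs:
        if mode != "confidences only":
            row.append(label)       # the final label, e.g. "P"
        if mode != "predictions only":
            row.append(confidence)  # confidence for the positive class
    return row


# Hypothetical outputs of three base learners for a single example.
outputs = [("P", 0.95), ("N", 0.4), ("N", 0.4)]

print(meta_features(outputs, "predictions only"))
print(meta_features(outputs, "confidences only"))
print(meta_features(outputs, "predictions and confidences"))
```

Defaulting to "predictions only" keeps the existing behaviour, so processes built before the parameter existed would produce the same result.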
Thanks for your reply. Making this optional totally makes sense. It would also allow one to investigate directly whether predictions versus confidences make a difference, and to do some more tweaking of the model in development.
Cheers,
Hanspeter