"Final prediction in bagging algorithm"

adrian_crouch Member Posts: 8 Contributor II
edited June 2019 in Help
Hello RM community,

I'm not sure whether I'm mistaken, but I always thought that the bagging meta-algorithm should choose the final prediction by majority vote (in classification). If the numeric confidences that the individual models produce for a label value are averaged at the same time, the final confidence would not necessarily map directly to the final prediction.

Let's say three models are aggregated and, for a given example in a binominal classification, they predict confidences of 0.4, 0.4 and 0.9 for class 'A' and 0.6, 0.6 and 0.1, respectively, for class 'B'. Averaging these confidences gives class 'A' a confidence of 0.567 and class 'B' 0.433. With majority voting, however, I would expect 'B' as the final predicted class, since two of the three models predicted it while only one predicted 'A'.
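
To make the arithmetic concrete, here is a minimal Python sketch of the two aggregation rules applied to the numbers above. The function names are made up for illustration and are not part of RapidMiner's API; the sketch only reproduces the calculation, not the BaggingModel code.

```python
from collections import Counter

# Per-model confidences for one example, taken from the text above.
confidences = [
    {"A": 0.4, "B": 0.6},
    {"A": 0.4, "B": 0.6},
    {"A": 0.9, "B": 0.1},
]


def average_confidences(per_model):
    """Average each label's confidence over all models."""
    labels = per_model[0].keys()
    return {
        label: sum(m[label] for m in per_model) / len(per_model)
        for label in labels
    }


def predict_by_average(per_model):
    """Pick the label with the highest averaged confidence."""
    avg = average_confidences(per_model)
    return max(avg, key=avg.get)


def predict_by_majority(per_model):
    """Let each model vote for its most confident label; pick the top vote."""
    votes = Counter(max(m, key=m.get) for m in per_model)
    return votes.most_common(1)[0][0]


avg = average_confidences(confidences)
print({label: round(value, 3) for label, value in avg.items()})
print("averaged confidences pick:", predict_by_average(confidences))
print("majority vote picks:", predict_by_majority(confidences))
```

The two rules disagree here because the single model that favours 'A' is far more confident than the two models that favour 'B'.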

This does not match the implementation in BaggingModel (version 5.3.008). There, the label value with the highest averaged confidence is chosen as the final prediction, which in the example above is 'A' because of its higher confidence of 0.567.

Could someone tell me whether I made a mistake in my thinking here?
Many thanks,
Adrian
Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,316 RM Data Scientist
    Hi Adrian,

    It simply comes down to a weighted versus an unweighted average. I think both are useful. Breiman's original RF implementation used the unweighted variant.

    ~Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • adrian_crouch Member Posts: 8 Contributor II
    You may be right: if the averaged confidences were multiplied by a weight derived from the number of times each label value was predicted, then my assumption would hold. But looking into BaggingModel's implementation, I can't find anything that deals with weights in this context (so it's no wonder the result does not match my expectation).
    So I don't quite get the point. Am I misinterpreting something, or is this indeed a bug in the bagging implementation?
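
The weighting idea from the last reply can be sketched as follows. This is only one possible reading of it: the vote-share weight and the name `predict_vote_weighted` are assumptions made for illustration, not something found in the BaggingModel source.

```python
from collections import Counter

# Same three-model example as in the question.
confidences = [
    {"A": 0.4, "B": 0.6},
    {"A": 0.4, "B": 0.6},
    {"A": 0.9, "B": 0.1},
]


def predict_vote_weighted(per_model):
    """Multiply each label's mean confidence by its share of the votes.

    Assumed scheme: weight = (models voting for the label) / (all models).
    """
    n = len(per_model)
    votes = Counter(max(m, key=m.get) for m in per_model)
    scores = {}
    for label in per_model[0]:
        mean_conf = sum(m[label] for m in per_model) / n
        scores[label] = mean_conf * votes[label] / n
    return max(scores, key=scores.get), scores


label, scores = predict_vote_weighted(confidences)
print(label, {k: round(v, 3) for k, v in scores.items()})
```

Under this assumed weighting, 'B' scores higher than 'A' (roughly 0.289 against 0.189), so the result agrees with the majority vote.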