Why are the confusion matrices generated by the Performance Vector and the predictive model different?

glybentta Member Posts: 6 Contributor I
edited November 2018 in Help

I am using Gradient Boosted Trees on my dataset. The process output shows the Performance Vector, which gives the accuracy and a confusion matrix, and the Gradient Boosted Model, which gives the model metrics, a confusion matrix, the variable importance, the model summary, and the scoring history. But the two confusion matrices are different. Which confusion matrix should I consider to evaluate my model?

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Solution Accepted

    Hi,

     

    the confusion matrix in the model reflects _training_ errors. So you should usually rely on the Performance Vector, not on the values reported by the Gradient Boosted Model. The training values are sometimes interesting for checking overfitting (e.g. deciding whether to add more complexity or not); see the sketch below.

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
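
    To make the distinction concrete, here is a minimal sketch in Python with scikit-learn (an analogue, not a RapidMiner process): it fits a gradient boosted classifier, then prints the confusion matrix on the training data (roughly what the model object itself reports) next to the one on held-out data (what a Performance Vector computed on a test set reports). The synthetic dataset, split ratio, and default parameters are illustrative assumptions, not values from the original question.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import train_test_split

    # Synthetic binary classification data (assumption: any labeled data works here)
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )

    model = GradientBoostingClassifier(random_state=42)
    model.fit(X_train, y_train)

    # "Model" confusion matrix: computed on the same data the trees were fit on
    print("Training confusion matrix:\n",
          confusion_matrix(y_train, model.predict(X_train)))

    # "Performance Vector" style confusion matrix: computed on unseen data
    print("Hold-out confusion matrix:\n",
          confusion_matrix(y_test, model.predict(X_test)))
    ```

    The training matrix is typically noticeably more optimistic; a large gap between the two is one indicator of overfitting, which is the only reason to look at the training values at all.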

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    > The training values are sometimes interesting for checking overfitting (e.g. deciding whether to add more complexity or not).

     

    True, but in general I am in the school of "just forget training errors completely." They create more damage than anything else 😉

     

    Have a nice weekend,

    Ingo
