
Gradient Boosted Trees don't show the dependent variable in the resulting trees

sbrnae Member Posts: 2 Newbie
edited July 2023 in Help
Hello, I want to ask about the Gradient Boosted Trees model that I used in my study on corporate default risk. My dependent variable is default vs. non-default, and I use 1 for default and 0 for non-default. I have already set the data type to binominal for the default/non-default attribute. After I add the related operators such as Select Attributes, Set Role, and Cross Validation, none of the trees in the result show at the end of their branches whether the outcome is 1 or 0 as I assigned before. Below I share one of the Gradient Boosted Trees models.
Figure 1: Tree for Gradient Boosted Model

However, I tried other models such as Decision Tree and Random Forest, and they give the desired result. Below are the attachments of the Decision Tree and Random Forest models.

Figure 2: Tree for Decision Tree

Figure 3: Tree for Random Forest

So, I want to ask: why do the 0 and 1 that I assigned not show up at the end of the branches in the Gradient Boosted Trees model, while they do show up in the other models? Is there any operator that I need to add to the process? I hope someone can help me find a way to solve this problem. Any help is welcome. Thank you in advance.

Answers

  • jmergler Administrator, Moderator, Employee, RapidMiner Certified Analyst, Member, University Professor Posts: 41 Guru
    edited July 2023
    Hi @sbrnae
    I don't think you are going to get what you are looking for with gradient boosting. These models are more difficult to interpret, and I don't think any data preparation you do before modeling will allow for that sort of output. With a Decision Tree or a Random Forest, each tree makes an independent prediction for the label, so the leaves show the class. With gradient boosting on a categorical label, each tree's leaf value is more like a contribution to the log-odds of the positive class, which can't easily be interpreted on its own. If you have several categories, it will create different trees with different positive classes. In situations like this, people often use the GBT for its predictive power, compare its performance and results with other models, and rely more on those other models for interpretability.
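    To make that difference concrete, here is a quick sketch of my own in Python with scikit-learn (an assumption on my part; RapidMiner's Gradient Boosted Trees operator is H2O-based, but the idea is the same): the trees inside a gradient boosted classifier are regression trees whose leaves hold real-valued log-odds contributions, while a plain decision tree's leaves hold class distributions you can read a 0/1 label from.

        # Illustration only: compares what lives in the leaves of a single
        # decision tree vs. the trees inside a gradient boosted classifier.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=200, n_features=5, random_state=0)

        # Plain decision tree: each leaf stores a class distribution,
        # so a 0 or 1 prediction can be read directly from the leaf.
        dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
        dt_leaves = dt.tree_.children_left == -1
        print(dt.tree_.value[dt_leaves])          # class distributions per leaf

        # Gradient boosted trees: each stage is a regression tree whose leaf
        # values are real numbers (corrections to the log-odds), so no 0/1
        # class label appears at the end of the branches.
        gbt = GradientBoostingClassifier(n_estimators=10, max_depth=3,
                                         random_state=0).fit(X, y)
        first_tree = gbt.estimators_[0, 0]        # regression tree of stage 1
        gbt_leaves = first_tree.tree_.children_left == -1
        print(first_tree.tree_.value[gbt_leaves]) # real-valued leaf scores

    So the numeric values you see at the ends of the GBT branches are expected behavior, not a setup problem in your process.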
    Best,
    Jeff 