Using Gradient Boosted Tree Output
kylejohnson
Member Posts: 7 Learner I
in Help
Hello,
New user here. Sorry if this has already been asked, but I can't find an answer anywhere. The simple version of the question is: how do I use the output model of the GBT? What do the numbers on the leaves mean? Why are there 60 trees in the model, and how are they all used together in application?
To give a little background that may or may not be helpful, I am a stock trader and have constructed an indicator for short-term price movement. This indicator works excellently sometimes and is not useful at others. I am trying to determine if there are patterns that can give me a better idea of when the indicator will work and when it won't. My attributes are all numerical values that are part of the indicator, and the label is "yes" if that particular prediction of stock movement was useful. My ultimate goal is to use RapidMiner to figure out when to listen to my indicator and when not to, and then to put that insight back into the trading indicator itself.
Thank you in advance for your time and insight,
Kyle
Best Answers
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
GBT is an ensemble method, so there are multiple trees by design. In fact, the number of trees is a parameter setting that you can control. No single tree is really useful or interpretable in this context. The entire set of trees must be used to make the prediction.
GBT is not a method that is suitable for simple explanations. If you want that, you can try the simpler Decision Tree operator, but you may see a significant deterioration in performance. Instead, if you want to use GBT, you will need to score future records in RapidMiner using that model and then rely on the prediction. These algorithms are somewhat "black box" in nature.
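To make "the entire set of trees must be used" a bit more concrete, here is a minimal Python sketch of how a generic gradient boosted classifier might score one record. It is not RapidMiner's internal code. The attribute names (`momentum`, `volume`), the thresholds, the leaf values, and the starting score are all made up for illustration, and the sketch assumes a logistic link, which is common for two-class GBT models.

```python
import math

# Hypothetical hand-written ensemble: each tree is a single split (a "stump").
# Leaf values are additive contributions to a raw score, not class labels.
TREES = [
    {"attribute": "momentum", "threshold": 0.5, "left": -0.4, "right": 0.6},
    {"attribute": "volume", "threshold": 1.2, "left": -0.2, "right": 0.3},
    {"attribute": "momentum", "threshold": 0.8, "left": -0.1, "right": 0.2},
]
INITIAL_SCORE = 0.0  # assumed starting score before any tree is applied


def leaf_value(tree, example):
    """Route one example through one stump and return the leaf it lands in."""
    if example[tree["attribute"]] < tree["threshold"]:
        return tree["left"]
    return tree["right"]


def predict_confidence(example):
    """Sum every tree's leaf value, then squash the total into a 0..1 confidence."""
    score = INITIAL_SCORE + sum(leaf_value(t, example) for t in TREES)
    return 1.0 / (1.0 + math.exp(-score))


example = {"momentum": 0.9, "volume": 1.0}
confidence = predict_confidence(example)
print("yes" if confidence >= 0.5 else "no", round(confidence, 3))
```

In this sketch no single leaf value means much on its own. Only the sum over all trees is turned into a confidence, which is why one tree taken out of the model is hard to interpret.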
varunm1 Member Posts: 1,207 Unicorn
Hello @kylejohnson
As mentioned by Telcontar120, multiple trees are built one after another, based on the parameters set in the operator.
Working:
First, the operator builds one decision tree, which can have multiple leaf nodes, each with a certain value. For each leaf node, the difference between its value and the original values is calculated, and this is taken as the error. The next tree is then built in such a way that this error is reduced.
The outputs of one tree are not used as inputs to the next, but the error from one tree is taken into consideration when building the next tree, so that the combined prediction deviates less from the original value. There is a simple video which explains this.
https://www.youtube.com/watch?v=ErDgauqnTHk
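The idea of each tree being built from the error of the previous ones can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses squared-error regression on a single numeric attribute with one-split trees, the toy data is invented, and `fit_stump`, `fit_boosted`, `n_trees`, and `learning_rate` are hypothetical names, not RapidMiner operators or parameters. A classifier would use a different error measure, but the loop has the same shape.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left leaf value, right leaf value)


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x < threshold else right_value


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


def fit_boosted(xs, ys, n_trees=20, learning_rate=0.5):
    """Each new stump is fitted to the residuals left by the stumps before it."""
    base = sum(ys) / len(ys)  # start from the mean of the label
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Add a damped version of the new stump's output to the running prediction.
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, predictions


# Invented toy data: one attribute, one numeric label.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

initial_mse = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
base, stumps, predictions = fit_boosted(xs, ys)
final_mse = mean_squared_error(ys, predictions)
print(len(stumps), "trees; error before:", round(initial_mse, 4), "after:", round(final_mse, 4))
```

Note that no tree receives another tree's output as an input attribute. The only link between trees is the list of residuals, which shrinks as more trees are added.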
Hope this helps you get an understanding. @Telcontar120 correct me if there is any misconception.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Answers
Thank you that makes more sense. Do I have the correct basic understanding (I apologize for the incorrect terminology):
When a new example is run through the model, it is put into "Tree 1" which gives it an output value "Leaf 1", then into "Tree 2" and given another output value "Leaf 2", until "Tree N" and "Leaf N". Then are all of the "Leaf Values" added up? How does the model arrive at a final output?
Again thank you in advance,
Kyle