
Gradient Boosted Trees: get full list of feature importances

phivu Member Posts: 34 Maven
edited December 2018 in Help

Hi RapidMiner,

I'm using Gradient Boosted Trees (GBT) for a binary classifier with ~500 features. The results are good, but I would also like to see the full list of feature/variable importances instead of only the 10 shown in the output model description (see the attached screenshot). I noticed a similar question from last year: http://community.rapidminer.com/t5/Product-Ideas/Gradient-Boosted-Trees-extract-feature-importance/idi-p/33066

Is this possible in the current RM 7.5.001? Or do you have any idea how to get this from the output GBT model? If you would like my process and data to check, let me know and I will send them to you individually.

Thank you very much for your help!

P.S. I'm using a licensed version of RM Studio Large under my company account: chungsd@stee.stengg.com

Best,

phivu

 

[Screenshot: GBT-output-model-description (GBT-output-description.png)]

Best Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist
    Solution Accepted

    Dear phivu,

    I think this has been handled since 7.4. The GBT operator now has a port called wei. It delivers the very same list as a RapidMiner weights object, which can be used to select attributes with Select by Weights or be transformed into an example set with Weights to Data.

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
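
As a rough illustration of what the two operators named in the answer above do with a weights object, here is a minimal Python sketch. It is not RapidMiner code and does not use any RapidMiner API: the function names, the attribute names, and the weight values are all hypothetical, and the threshold rule is only an assumed stand-in for one way of selecting by weight.

```python
from typing import Dict, List, Tuple


def weights_to_rows(weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return every (attribute, weight) pair, highest weight first.

    Loosely mirrors turning a weights object into a table with one row
    per attribute, so the full list is visible, not just the top 10.
    """
    # Sort by descending weight; break ties alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))


def select_by_weight(weights: Dict[str, float], min_weight: float) -> List[str]:
    """Keep only attributes whose weight is at least min_weight (assumed rule)."""
    return [name for name, w in weights_to_rows(weights) if w >= min_weight]


# Hypothetical importances, standing in for what the wei port would deliver.
example_weights = {"age": 0.42, "income": 0.31, "zip_code": 0.05, "tenure": 0.22}

rows = weights_to_rows(example_weights)
selected = select_by_weight(example_weights, min_weight=0.2)

for name, weight in rows:
    print(f"{name}\t{weight:.2f}")
print("selected:", selected)
```

In the actual process, the same two steps would be done by connecting the wei port to Weights to Data (full table) or to Select by Weights (filtered attribute set).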

Answers

  • phivu Member Posts: 34 Maven

    Oh, I got it. Thank you very much Martin!

  • Pirehelo Member Posts: 12 Contributor II

    Hi,

    Could you please explain the basis for ranking attribute importance in GBT? For example, is it based on information gain, or does it use a forward selection/backward elimination approach like the one the SelectAttribute operator uses? I would appreciate your answers, and even more so a pointer to an article or web page (ideally from the RapidMiner documentation) that explains the mathematical logic behind the attribute importance ranking in Gradient Boosted Trees (GBT).

    Thanks,

  • phivuphivu Member Posts: 34 Maven

    @Pirehelokan, you should tag @mschmitz in your question so that he is alerted. Thanks, Martin, for your help. :)
