Feature Importance - logistic regression.

SA_H Member Posts: 26 Contributor II
edited December 2019 in Help
Could you please help me determine feature importance for a logistic regression model?

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,085   Unicorn
    Hi @SA_H,

    You can use Auto Model for that:
     - Submit your data to Auto Model
     - Select "Logistic Regression" as the model
     - In the results screen, click on "Weights" under "Logistic Regression" to see the feature importance



    Regards,

    Lionel
  • kypexin Moderator, RapidMiner Certified Analyst, Member Posts: 286   Unicorn
    Hi @SA_H

    You can also open the model itself and look at the coefficients. Generally, the larger a coefficient's absolute value, the more important the corresponding variable, provided the attributes are on comparable scales. The sign also matters: positive coefficients push the prediction toward the positive class, while negative coefficients push it away.
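To make the coefficient reading concrete, here is a minimal sketch in plain Python rather than RapidMiner. It assumes the attributes are standardized first so that the coefficients are comparable. The attribute names (`dose`, `weight_g`, `noise`), the synthetic data, and the helper functions are all hypothetical and only serve as illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def standardize(X):
    # Rescale every column to zero mean and unit variance.
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def fit_logistic(X, y, lr=0.5, epochs=300):
    # Batch gradient descent on the log-loss; returns (weights, bias).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Hypothetical data: "dose" has a strong positive effect, "weight_g" a weaker
# negative effect measured in much larger units, and "noise" has no effect.
rng = random.Random(42)
names = ["dose", "weight_g", "noise"]
X, y = [], []
for _ in range(400):
    row = [rng.gauss(0, 1), rng.gauss(0, 10), rng.gauss(0, 1)]
    p = sigmoid(2.0 * row[0] - 0.1 * row[1])
    X.append(row)
    y.append(1 if rng.random() < p else 0)

weights, bias = fit_logistic(standardize(X), y)

# Rank by absolute value; the sign tells the direction of the effect.
ranked = sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)
for name, w in ranked:
    direction = "supports" if w > 0 else "opposes"
    print(f"{name:10s} {w:+.3f}  ({direction} the positive class)")
```

Without the standardization step, the raw coefficient of `weight_g` would look tiny simply because the attribute is measured in large units, which is one reason raw coefficients should not be read as importance directly.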
  • SA_H Member Posts: 26 Contributor II
    Thank you, lionelderkrikor and kypexin, for your replies. I am still confused, because I believe the weights (coefficients) cannot be interpreted directly as the importance of the independent variables.
  • hughesfleming68 Member Posts: 306   Unicorn
    edited January 4
    @SA_H,

    You can, following on from what @kypexin has already mentioned. This applies to linear models in general, including linear SVMs. To make it work, you need to make several runs across your data and then aggregate the model weights. Often you will see a pattern where the same attributes repeatedly receive high weights relative to the others. You can then rank them by weight. Looking at one sample is rarely enough if you have lots of attributes. There is a weights to example set operator these days which makes this less painful. In the old days, it required parsing text files, which was a lot of extra work to set up. Even then it was worth the effort.
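The "several runs, then aggregate" idea can be sketched outside RapidMiner as follows. This is only an illustration under stated assumptions: the runs are bootstrap resamples (one possible reading of "several runs"), the aggregate is the mean absolute weight, the attributes `a` to `d` and their data are synthetic, and `aggregate_weights` is a hypothetical helper, not a RapidMiner operator.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=150):
    # Batch gradient descent on the log-loss; returns (weights, bias).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def aggregate_weights(X, y, names, runs=8, seed=0):
    # Fit one model per bootstrap resample and average the absolute weights.
    rng = random.Random(seed)
    n = len(X)
    totals = {name: 0.0 for name in names}
    for _ in range(runs):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        w, _ = fit_logistic([X[i] for i in idx], [y[i] for i in idx])
        for name, wj in zip(names, w):
            totals[name] += abs(wj)
    means = [(name, total / runs) for name, total in totals.items()]
    return sorted(means, key=lambda pair: pair[1], reverse=True)

# Synthetic data on a common scale: only "a" and "c" drive the label.
rng = random.Random(7)
names = ["a", "b", "c", "d"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(300)]
y = [1 if rng.random() < sigmoid(2.0 * r[0] - 1.0 * r[2]) else 0 for r in X]

ranking = aggregate_weights(X, y, names)
for name, score in ranking:
    print(f"{name}: mean |weight| = {score:.3f}")
```

Attributes that stay near the top across resamples are the ones worth trusting; an attribute that ranks high in a single run only may just reflect that particular sample.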

    There are also the feature selection operators. The most useful, in my opinion, is Optimize Selection (Evolutionary), but don't ignore simple Backward Elimination. There are pros and cons to all approaches, as each opens you up to different ways of curve fitting. Auto Model is the newest way, but understanding how the older operators work is a good use of time.
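For readers who want to see what backward elimination does in principle, here is a rough sketch in plain Python. It is not the RapidMiner operator's implementation: it assumes a single hold-out split, uses validation accuracy as the criterion, and works on synthetic attributes `a` to `d`. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=150):
    # Batch gradient descent on the log-loss; returns (weights, bias).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def score(cols, train, valid):
    # Fit on the training rows using only `cols`; return validation accuracy.
    (X_tr, y_tr), (X_va, y_va) = train, valid
    pick = lambda X: [[row[j] for j in cols] for row in X]
    w, b = fit_logistic(pick(X_tr), y_tr)
    hits = 0
    for xi, yi in zip(pick(X_va), y_va):
        pred = 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        hits += pred == yi
    return hits / len(y_va)

def backward_elimination(n_features, train, valid):
    # Start with every column; drop one at a time while accuracy does not fall.
    kept = list(range(n_features))
    best = score(kept, train, valid)
    while len(kept) > 1:
        trials = [(score([c for c in kept if c != drop], train, valid), drop)
                  for drop in kept]
        trial_score, drop = max(trials)
        if trial_score < best:
            break  # every possible removal hurts, so stop
        kept.remove(drop)
        best = trial_score
    return kept, best

# Synthetic data: only "a" and "c" carry signal.
rng = random.Random(3)
names = ["a", "b", "c", "d"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(400)]
y = [1 if rng.random() < sigmoid(2.0 * r[0] - 1.0 * r[2]) else 0 for r in X]
train, valid = (X[:250], y[:250]), (X[250:], y[250:])

full_acc = score(list(range(len(names))), train, valid)
selected, final_acc = backward_elimination(len(names), train, valid)
selected_names = [names[j] for j in selected]
print("kept:", selected_names, "accuracy:", round(final_acc, 3), "vs full:", round(full_acc, 3))
```

Because the same validation rows decide every removal, the selection itself can overfit to that split, which is the curve-fitting risk mentioned above. Scoring the final attribute set on data that played no part in the selection gives a more honest estimate.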