Most relevant attribute within the classification of a specific instance?

marcopo Member Posts: 14 Contributor II
Hello,
Does anyone know a way to identify the most relevant attribute within the classification of a specific instance?

Of course, it is possible to see the most important attributes for the model itself. But the values of the attributes vary, and the most important attribute for the model may not be the most important one for the classification of a specific instance.

Thanks a lot

Marco

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi marco,

    To be honest, I think this is hardly possible for a generic model. It might be possible for some models, but it is very hard to judge individual relevance (however we define that) for a truly multivariate method.
    If you have a rule like: If 50 < Age < 79 && Gender=="male" && TransactionValue > 100 from a decision tree, what would you assign as relevance? In the end, the combination of the conditions produced the result.

    I think not. Maybe you can get it from some models, but definitely not for all.


    ~Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • marcopo Member Posts: 14 Contributor II
    Thank you Martin, you are right. But a regression model should work in most cases, as long as the regression coefficients are not too large. The attribute that contributes the largest amount should be the most relevant for the prediction. But how can I extract the formula and insert the values from the instance?

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,

    Do you mean linear regression? In that case you might be right, even though I do not know a way to do this by heart. The formula is given in the linear regression model, and the coefficients are in the weight vector. So Weights to Data, Join, and Generate Attributes might work?

    For a general regression model this is still something hard to do.

    ~Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
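
The idea discussed above for linear regression (multiply each coefficient by the instance's attribute value and compare the resulting terms) can be sketched outside RapidMiner as follows. This is only a minimal illustration: the coefficient values, the intercept, the `IsMale` encoding, and the helper names `contributions` and `most_relevant` are all hypothetical and are not taken from any real model or from the RapidMiner API.

```python
# Hypothetical coefficients of an already fitted linear regression
# (illustrative values only, not from a real model).
weights = {"Age": 0.5, "TransactionValue": 0.125, "IsMale": 4.0}
intercept = 10.0


def contributions(instance, weights):
    """Return weight * value for every attribute of a single instance."""
    return {name: weights[name] * instance[name] for name in weights}


def most_relevant(instance, weights):
    """Name of the attribute whose term has the largest absolute size."""
    contrib = contributions(instance, weights)
    return max(contrib, key=lambda name: abs(contrib[name]))


# Two instances scored by the same model.
older_customer = {"Age": 60, "TransactionValue": 160, "IsMale": 1}
big_spender = {"Age": 20, "TransactionValue": 400, "IsMale": 0}

# The prediction is the intercept plus the sum of the per-attribute terms.
prediction = intercept + sum(contributions(older_customer, weights).values())

top_older = most_relevant(older_customer, weights)
top_spender = most_relevant(big_spender, weights)
```

With these made-up numbers the largest term differs between the two instances, which is exactly the effect the original question is about. Note that the raw terms depend on the scale of each attribute, so normalizing the attributes first, or measuring each value relative to the attribute's mean, may give a more meaningful ranking.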
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    Well, a logistic regression would give you a prediction between 0 and 1 for classification problems, and it is quite straightforward as a formula (if you use the Weka logistic regression and not the KLR in the RapidMiner core).
    And because the formula for each record is pretty simple (weight1 * att1, weight2 * att2, ...), you can turn this into a calculation for each attribute and generate the results using loops. (Other formula-generating models are possible, but once you get over 100 support vectors per attribute you get a bit blurry-eyed and error checking is difficult.)

    Short version: yes, it's possible, but you'd need to break the scoring of each model down into its individual parts, and it really only works well with Weka Logistic Regression.

    I've also used Weight of Evidence transforms before to generate record scorecards, which then (when you generate a logistic regression from them) let you see, for each example, which attribute was the most important to the model for that specific instance.
    http://rapid-i.com/rapidforum/index.php/topic,9047.msg30446.html
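
The Weight of Evidence scorecard idea mentioned above can be sketched roughly like this. It is a toy illustration under several assumptions: the data, the 0.5 smoothing constant, the sign convention (log of the positive share over the negative share), the coefficient values, and the names `woe_table` and `record_points` are all hypothetical. In practice the coefficients would come from a logistic regression fitted on the WoE-transformed attributes.

```python
import math
from collections import Counter


def woe_table(values, labels):
    """WoE per category: ln(share of positives / share of negatives)."""
    pos = Counter(v for v, y in zip(values, labels) if y == 1)
    neg = Counter(v for v, y in zip(values, labels) if y == 0)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    table = {}
    for cat in set(values):
        # Simple 0.5 smoothing (an assumption) avoids log(0) for
        # categories that occur in only one class.
        p = (pos[cat] + 0.5) / (n_pos + 0.5)
        q = (neg[cat] + 0.5) / (n_neg + 0.5)
        table[cat] = math.log(p / q)
    return table


# Toy training data (made up for illustration).
data = {
    "gender": ["male", "male", "female", "female", "male", "female"],
    "age_band": ["young", "old", "old", "young", "old", "old"],
}
labels = [1, 1, 0, 0, 1, 0]

woe = {attr: woe_table(vals, labels) for attr, vals in data.items()}

# Hypothetical logistic regression coefficients on the WoE-transformed
# attributes; a real scorecard would take these from a fitted model.
coefs = {"gender": 1.0, "age_band": 1.0}


def record_points(record):
    """Per-attribute contribution to the log-odds for one record."""
    return {attr: coefs[attr] * woe[attr][record[attr]] for attr in coefs}


record = {"gender": "male", "age_band": "young"}
points = record_points(record)
top = max(points, key=lambda attr: abs(points[attr]))
```

Because every attribute is expressed on the same log-odds scale after the transform, the per-record points can be compared directly, which is what makes this kind of scorecard readable for a single instance.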