feature weightage vs domain inputs.

Thiru · March 2022

hi all,

When I use the Explain Predictions operator, it produces feature weights, and those weights vary with the algorithm I select.

For example, with kNN the top 3 features are feature A, feature B, and feature C, with feature D ranked below them.

1. However, my domain knowledge says feature D is the most important one. In that case, will kNN (for which feature D is not important) still do the job, given that it shows good accuracy during training and testing?

2. Or, in the above scenario, should I go for a model such as SVM, which naturally considers feature D the most important attribute? The catch is that SVM performs worse than kNN on the given data set during training and testing.

Can I have some clarity on how to approach this, particularly when the order of feature weights suggested by the Explain Predictions operator conflicts with domain inputs? Thanks.

regards
thiru

BalazsBarany · March 2022

Hi!

Explain Predictions and feature weighting are diagnostic tools, and your models are tools to achieve your goal, too. Don't overestimate the precision of Explain Predictions and feature weights: a complex model will have complex interactions between attributes.

Is it easy or hard to get all the features at the same time without missing values? Are you interested in accuracy or in an explainable model? Might your attributes have some potential for discriminating against people? And so on.

Sometimes our domain knowledge betrays us or it is just too simplistic. That's why we use machine learning. A, B, C probably contain additional knowledge and they help improve the model beyond just looking at D.

All this said: Use the model (after proper validation) that solves your problem best, however the problem is defined.

Regards,
Balázs

MartinLiebig · March 2022

Hi,

Also keep in mind that Explain Predictions explains the prediction, not the label. So it helps you understand what 'the model thinks about the world'. If the model is a bad approximation of the world in the first place, it does not help.

Also, one should really think about what the result of Explain Predictions means.

BR,

Martin
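
To make the point from the replies concrete (an explanation describes the model, not the world), here is a minimal sketch in plain Python. It is not what the Explain Predictions operator does internally; it swaps in simple permutation importance as a stand-in. The data set, the two threshold "models", and all names are made up for illustration: the label depends only on feature D, and feature A is assumed to be a noisy copy of D.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)

def permutation_importance(model, rows, labels, col, rng):
    """Accuracy drop after shuffling one column, which breaks its link to the label."""
    baseline = accuracy(model, rows, labels)
    shuffled = [row[col] for row in rows]
    rng.shuffle(shuffled)
    permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, shuffled)]
    return baseline - accuracy(model, permuted, labels)

rng = random.Random(0)
NAMES = ["A", "B", "C", "D"]

# Synthetic data (assumption): the label depends only on D; A is a noisy copy of D,
# and B and C are pure noise.
rows, labels = [], []
for _ in range(500):
    d = rng.random()
    a = d + rng.gauss(0, 0.05)
    rows.append([a, rng.random(), rng.random(), d])
    labels.append(1 if d > 0.5 else 0)

# Two hypothetical models: one reads only A, the other reads only D.
def model_on_a(row):
    return 1 if row[0] > 0.5 else 0

def model_on_d(row):
    return 1 if row[3] > 0.5 else 0

results = {}
for name, model in [("model_on_a", model_on_a), ("model_on_d", model_on_d)]:
    importances = {
        NAMES[col]: permutation_importance(model, rows, labels, col, rng)
        for col in range(4)
    }
    results[name] = (accuracy(model, rows, labels), importances)
    print(name, "accuracy:", round(results[name][0], 3), importances)
```

Both models should score well here, yet the first attributes everything to A and nothing to D, even though D is what actually drives the label in this toy setup. The importance ranking tells you which inputs a given model relies on, so a disagreement with domain knowledge is a prompt to look closer (for instance at correlated features), not proof that either side is wrong.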
