Variable importance in deep learning and what to do with it?
fstarsinic
What does one learn from variable importance?
What might one change in the model based on what one sees?
If variables that you know are not important show up as "very important" at the top, does that make them candidates for "attribute removal" or "weight reduction" or...?
Example: I have a few category attributes that are hierarchical. If the uppermost parent category has high importance, does it really need to be there at all if the lower categories are the ones that really tell the story? It seems to me this is telling me I can get rid of that feature/attribute, and that perhaps the model is relying too much on the upper-level category to make predictions.
Yes, I know I should try removing it to see what happens, but in general I'm wondering how variable importance should be interpreted.
Answers
So in general these variable importance measures are either based on heuristics or are generated empirically by selectively removing attributes and determining the proportional loss in predictive power. @mschmitz might know the actual method being used "under the hood" for the native DL algorithm.
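To make the "remove an attribute and measure the loss" idea concrete, here is a minimal Python sketch of drop-column importance. It is not the method the native DL operator uses. The nearest-centroid classifier is just a stand-in model, and the toy data, the attribute names `signal` and `noise`, and all function names are made up for illustration.

```python
from statistics import mean

def fit_centroids(rows, labels):
    """Nearest-centroid 'model': one mean vector per class label."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the label whose centroid is closest (ties go to the lowest label)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=sq_dist)

def accuracy(train_rows, train_y, test_rows, test_y):
    """Fit on the training rows and return accuracy on the test rows."""
    centroids = fit_centroids(train_rows, train_y)
    hits = sum(predict(centroids, r) == y for r, y in zip(test_rows, test_y))
    return hits / len(test_y)

def drop_column(rows, idx):
    """Copy of rows with the attribute at position idx removed."""
    return [r[:idx] + r[idx + 1:] for r in rows]

def drop_column_importance(train_rows, train_y, test_rows, test_y, names):
    """Importance of an attribute = accuracy lost when the model is refit without it."""
    baseline = accuracy(train_rows, train_y, test_rows, test_y)
    importance = {}
    for i, name in enumerate(names):
        reduced = accuracy(drop_column(train_rows, i), train_y,
                           drop_column(test_rows, i), test_y)
        importance[name] = baseline - reduced
    return baseline, importance

# Toy data (invented): "signal" separates the classes, "noise" does not.
names = ["signal", "noise"]
train_rows = [[0.0, 0.2], [0.1, 0.8], [0.2, 0.2], [0.1, 0.8],
              [1.0, 0.2], [0.9, 0.8], [0.8, 0.2], [0.9, 0.8]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_rows = [[0.15, 0.8], [0.05, 0.2], [0.85, 0.2], [0.95, 0.8]]
test_y = [0, 0, 1, 1]

baseline, importance = drop_column_importance(
    train_rows, train_y, test_rows, test_y, names)
print("baseline accuracy:", baseline)
for name, loss in importance.items():
    print(f"accuracy lost without {name}: {loss}")
```

On this toy data, dropping `signal` costs half of the accuracy while dropping `noise` costs nothing. Real data is rarely that clean: with correlated attributes, removing one may cost very little because another one carries the same information.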
In any case, I think the bottom line is that you always need to take them with a grain of salt, and you may want to look at some of the other operators like "Explain Predictions" to explore what is happening for any given set of attribute values.
And I agree with @jacobcybulski that it is always a good idea to play around with your input attributes manually a bit if they have clear relationships and you are trying to get a better understanding of what is going on (like in your example of a multi-level hierarchical attribute set).
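For the hierarchical case, "playing around manually" could look like the following Python sketch: fit the same model on the parent level only, the child level only, and both, then compare. This is only an illustration under assumptions. The lookup-table classifier, the `parent`/`child` attribute names, and the tiny catalog are all invented, not taken from the original question or from RapidMiner.

```python
from collections import Counter, defaultdict

def fit_lookup(rows, labels, cols):
    """Majority label per combination of the chosen attributes."""
    votes = defaultdict(Counter)
    for row, y in zip(rows, labels):
        votes[tuple(row[c] for c in cols)][y] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen keys
    return table, default

def score(train, train_y, test, test_y, cols):
    """Accuracy on the test rows of a lookup model that only sees `cols`."""
    table, default = fit_lookup(train, train_y, cols)
    hits = sum(table.get(tuple(r[c] for c in cols), default) == y
               for r, y in zip(test, test_y))
    return hits / len(test_y)

# Invented two-level hierarchy: the label is decided at the child level.
catalog = [
    ("Food", "Fruit", 1), ("Food", "Dairy", 1), ("Food", "Meat", 0),
    ("Electronics", "Phone", 0), ("Electronics", "Laptop", 0),
    ("Electronics", "Camera", 1),
]
rows = [{"parent": p, "child": c} for p, c, _ in catalog]
labels = [y for _, _, y in catalog]

train, train_y = rows * 2, labels * 2   # each combination seen twice
test, test_y = rows, labels

results = {
    "parent": score(train, train_y, test, test_y, ["parent"]),
    "child": score(train, train_y, test, test_y, ["child"]),
    "both": score(train, train_y, test, test_y, ["parent", "child"]),
}
for subset, acc in results.items():
    print(f"{subset}: {acc:.2f}")
```

Here the parent level alone gets 4 of 6 rows right, while the child level alone already matches the score of using both. In a situation like that, a high importance for the parent mostly reflects information the child also carries, so removing the parent is a reasonable experiment to run.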
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts