Weighting and nominal attributes

kypexin · May 2018

Hello there rapidminers

A small question about the process used for weighting attributes within the Auto Model process.

It has a section which processes nominals, namely, performs dummy coding:

[Screenshot 2018-05-30 165840.png]

The question is: why is this done specifically for the weighting process?

And how should one then interpret the weighting results?

For example, here are results from an IP traffic classification, with and without dummy coding. As one can see, for binominal categories the weights are exactly the same, but how should one interpret the particular values that appear in the first case (all false except for cat_spam = true)?

[Screenshot 2018-05-30 170558.png: weights with dummy coding]

[Screenshot 2018-05-30 170629.png: weights without dummy coding]

(kindly tagging @IngoRM)

MartinLiebig · May 2018

Hey,

Keep in mind that these weights are Pearson's rho values. So you can't apply this method to nominals directly; you need to convert them with dummy coding first.
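A minimal sketch of why the conversion is needed and what it produces, in plain Python. The `cat_spam` values and the labels are made-up illustration data, and `pearson` and `dummy_code` are hypothetical helpers, not RapidMiner operators. It also suggests why the two dummy columns of a binominal attribute end up with the same weight: the columns are complements of each other, so their correlations with the label differ only in sign.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def dummy_code(name, values):
    """Turn one nominal column into one 0/1 column per distinct value."""
    return {
        f"{name} = {v}": [1 if x == v else 0 for x in values]
        for v in sorted(set(values))
    }


# Made-up data: a binominal attribute and a binary label (1 means label = true).
cat_spam = ["true", "false", "false", "true", "false", "false", "true", "false"]
label = [1, 0, 0, 1, 0, 1, 1, 0]

# Correlation is undefined on strings, so the nominal column is dummy coded first.
dummies = dummy_code("cat_spam", cat_spam)
correlations = {col: pearson(vals, label) for col, vals in dummies.items()}

for col, r in correlations.items():
    # The sign gives the direction; the absolute value is what a weight would keep.
    print(f"{col}: r = {r:+.4f}, |r| = {abs(r):.4f}")
```

For an attribute with more than two values the dummy columns are no longer complements, so each value gets its own, generally different, correlation.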

Cheers!

Martin

kypexin · May 2018

Aah yes exactly

Still, there's a question of interpretation: I struggle to explain the relation between these true/false values and the label. In my example, does 'cat_reputation = false' support or contradict 'label = true'? Or, put another way, given its rather low correlation value in the correlation matrix (0.099), is it just 'the most important predictor' among the others while still being quite weak?

MartinLiebig · May 2018

Hi,

I think it should support it, if I got it right. But it's normalized to 1, so it's all relative to the highest influence factor.
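A rough sketch of what "normalized to 1" could look like, in plain Python. It assumes the normalization divides each absolute correlation by the largest one; that is an assumption, not something confirmed for Auto Model. Only the 0.099 value comes from this thread; the other correlations and the `cat_malware` attribute are invented for illustration.

```python
# Signed correlations with 'label = true'. Only 0.099 is taken from the thread;
# the other two values (and cat_malware) are hypothetical.
signed_correlations = {
    "cat_reputation = false": 0.099,
    "cat_spam = true": -0.05,
    "cat_malware = true": 0.02,
}


def normalize_weights(correlations):
    """Scale absolute correlations so that the largest becomes 1.0 (assumed scheme)."""
    largest = max(abs(r) for r in correlations.values())
    return {name: abs(r) / largest for name, r in correlations.items()}


def direction(r):
    """Read the direction from the sign, which the weight itself no longer carries."""
    return "supports" if r > 0 else "contradicts"


weights = normalize_weights(signed_correlations)

for name, r in signed_correlations.items():
    print(f"{name}: weight = {weights[name]:.3f}, "
          f"raw r = {r:+.3f}, {direction(r)} label = true")
```

Under this assumption a weight of 1.0 only means "strongest among these attributes"; the raw correlation behind it can still be as weak as 0.099, and whether it supports or contradicts the label has to be read from the signed correlation, not from the weight.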

Best,

Martin

sgenzer · May 2018

tagging @IngoRM if he's available...
