# "Interpretation of attribute weights"

Member Posts: 157 Guru
edited May 2019 in Help
I've been using EvolutionaryWeighting to optimize attribute weights with the NearestNeighbors learner, and I'm hoping to get some clarity on how to interpret the values of attribute weights.

(For my processes, I have a numerical label, all the numerical attributes have been normalized, and there are one or two polynominal attributes.)

1) I've tried running it both with bounded_mutation=TRUE (which constrains weights to between 0 and 1) and bounded_mutation=FALSE, where weights can vary freely.  Is bounded_mutation=TRUE essentially just a rescaling of the attribute weights to fit in the [0,1] interval (so they are functionally equivalent)?  Or is there some other side effect of this setting that I should be aware of?

2) With bounded_mutation=FALSE, the weights can become negative.  Does the sign of the weight value have any particular meaning (e.g. does it mean the attribute is negatively correlated with the label)?  Or is it just the absolute value of the weight that indicates its importance?

3) With bounded_mutation=TRUE, it appears to be the case, at least after several generations, that there is an attribute with weight=1, and one with weight=0.  Does the weight=0 imply that the attribute is not used at all in computing the neighbors?

Thanks,
Keith

• RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
Hi Keith.

1) If the attribute values have been normalized, the two settings should be functionally equivalent once the absolute values of the weights are normalized to [0,1].
2) As long as all examples are multiplied by the same weight vector, the distances between them do not change when a weight is negative rather than positive but has the same absolute value.
3) Yes.
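A quick sketch of point 2, using hypothetical attribute values (not from this thread) and assuming a weighted Euclidean distance where each attribute is multiplied by its weight before the difference is taken (RapidMiner's actual kNN metric may differ in detail): flipping the sign of a weight leaves the distance unchanged, because only squared weighted differences enter the sum.

```python
import math

def weighted_distance(x, y, weights):
    """Euclidean distance after multiplying each attribute by its weight.
    Each term is (w*a - w*b)^2 = w^2 * (a - b)^2, so only |w| matters."""
    return math.sqrt(sum((w * a - w * b) ** 2 for w, a, b in zip(weights, x, y)))

x = [0.2, 0.9, 0.4]
y = [0.7, 0.1, 0.5]

d_pos = weighted_distance(x, y, [0.5, 1.0, 0.3])
d_neg = weighted_distance(x, y, [-0.5, 1.0, 0.3])  # sign of first weight flipped

print(d_pos == d_neg)  # → True
```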

Greetings,
Sebastian
• Member Posts: 157 Guru
Revisiting this topic...

1) Is it true that the rescaling done with bounded_mutation=TRUE means you will always have at least one attribute weight = 1 and at least one other weight  = 0 (assuming you have 2 or more attributes with different weights)?

I ask because I was monitoring the intermediate weights during a run of Evolutionary Weighting, logging the intermediate best-so-far weights every generation, and the final intermediate weights were:

attr1  0.061610969
attr2  0.623057776
attr3  0.240515559

but the actual weights returned from Evolutionary Weighting were:

attr1 0
attr2 1
attr3 0.318649226

which is a rescaling of each weight by calculating  (Wgt - MinWgt) / (MaxWgt - MinWgt).
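That rescaling is easy to check in Python against the logged values above (the helper name here is mine, not a RapidMiner API):

```python
def min_max_rescale(weights):
    """Rescale weights to [0, 1] via (w - min) / (max - min)."""
    lo, hi = min(weights), max(weights)
    return [(w - lo) / (hi - lo) for w in weights]

logged = [0.061610969, 0.623057776, 0.240515559]  # attr1..attr3 from the log
print([round(w, 6) for w in min_max_rescale(logged)])
# → [0.0, 1.0, 0.318649]  -- matches the weights returned by the operator
```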

I originally set bounded_mutation=TRUE  to avoid problems of interpreting negative weights.  But it was not obvious to me that by setting it, an attribute weight would always be forced to zero.  Given the answer to 3) in the earlier response, this would be a problem for the Nearest Neighbor learner, since a weight of zero would mean that the attribute is essentially ignored, rather than just given a low value.  Correct?

If I'm on the right track, then to make interpreting weights easier, what I really want is for weights to be rescaled as Wgt / MaxWgt, so that the largest weight is always 1 and the others are proportional to it, but are not forced to zero if the original weight was non-zero.

If my understanding is correct, I think the best way for me to proceed is to set bounded_mutation=FALSE, and deal with interpretation issues of any negative weights and rescaling outside of the RM process.

2) As a side issue, would it make sense for EvolutionaryWeighting to always return nonnegative weights, or are there other circumstances/learners where negative weights are not equivalent to positive weights?

Thanks as always,
Keith
• Member Posts: 157 Guru
Following up my own question...

Some further investigation has shown that it's normalize_weights, not bounded_mutation, that does the conversion to the [0,1] scale.  Sorry for that misunderstanding.

However, I am still a little confused, then, on the difference between normalize_weights and bounded_mutation, and how they interact.

Consider the following process:
```xml
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="polynomial"/>
    </operator>
    <operator name="EvolutionaryWeighting" class="EvolutionaryWeighting" expanded="yes">
        <parameter key="normalize_weights" value="false"/>
        <operator name="LinearRegression" class="LinearRegression">
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
        </operator>
        <operator name="Performance" class="Performance">
        </operator>
    </operator>
</operator>
```
If I run this process twice, once with normalize_weights=TRUE and once with normalize_weights=FALSE, I get two sets of weights that clearly correspond to each other: the normalize_weights=TRUE values are norm_wgt = (abs(Wgt) - Min(abs(Wgt))) / (Max(abs(Wgt)) - Min(abs(Wgt))).  But if I run it with bounded_mutation=TRUE (and normalize_weights=FALSE), the weights aren't even in the same rank order as the previous runs:

| Attrib |      A |     B |     C |
|--------|-------:|------:|------:|
| att1   |  0.156 | 0.000 | 0.555 |
| att2   |  7.666 | 0.998 | 0.759 |
| att3   | -7.684 | 1.000 | 0.104 |
| att4   | -0.929 | 0.103 | 0.224 |
| att5   | -3.942 | 0.503 | 0.246 |

A: normalize_weights=FALSE
B: normalize_weights=TRUE
C: bounded_mutation=TRUE (normalize_weights=FALSE)
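The claimed relationship between columns A and B can be verified by applying the abs-then-min-max rescaling to column A (a quick sketch, using the values from the table above):

```python
def normalize_abs(weights):
    """(|w| - min|w|) / (max|w| - min|w|), as normalize_weights appears to do."""
    a = [abs(w) for w in weights]
    lo, hi = min(a), max(a)
    return [(v - lo) / (hi - lo) for v in a]

col_a = [0.156, 7.666, -7.684, -0.929, -3.942]  # normalize_weights=FALSE run
print([round(v, 3) for v in normalize_abs(col_a)])
# → [0.0, 0.998, 1.0, 0.103, 0.503]  (matches column B)
```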

Something obviously different is happening with bounded_mutation, but I don't understand enough of what's being done behind the scenes to really interpret these results.  Can someone help?

Thanks,
Keith
• Member Posts: 1 Contributor I
Hi,

Any thoughts on this? I am experiencing the same issue

Thanks,
Jason
• RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
Hi guys,
bounded really means bounded: during the evolutionary optimization process, each individual's weights are restricted to values between 0 and 1. This can produce different results than unbounded mutation, where attribute values can effectively be inverted (5 becomes -5 if the weight is -1). That inversion might improve performance for some learners, since a separating boundary can become wider.
Normalization removes this effect, since only the absolute values are used.

Greetings,
Sebastian