"How the

radone · November 2009

Hello,

Using the "Optimize Weights (Evolutionary)" operator, I found optimal weights and exported them to a file. Its structure is:


<?xml version="1.0" encoding="windows-1250"?>
<attributeweights version="5.0beta">
    <weight name="Attrib_1" value="0.4805528186159622"/>
    <weight name="Attrib_2" value="0.257652703956798"/>
    ...
</attributeweights>

I can load these weights using:


<operator activated="true" class="read_weights" expanded="true" height="60" name="Read Weights" width="90" x="112" y="390">
        <parameter key="attribute_weights_file" value="weight.wgt"/>
</operator>

How can I apply these weights to a learner to get the same results that the "Optimize Weights (Evolutionary)" learning process gave me?

The "Read Weights" documentation refers to "AttributeWeightsApplier", but I cannot find such an operator in RM 5.0. I have also found "Scale by Weights", but I am not sure whether its behaviour is the same as the weighting applied inside "Optimize Weights (Evolutionary)".

Any help would be really appreciated,

Radone

land · November 2009

Hi Radone,
unfortunately we haven't found the time to update the documentation. *sigh*
But yes, Scale by Weights is exactly what the optimization does. However, this will only work with learners that are sensitive to the scale of the numerical attributes. For example, LinearRegression would merely adapt its coefficients and deliver the same result, while the NearestNeighbour learner would behave differently because the distances change.
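The difference between the two kinds of learner can be illustrated with a minimal sketch. It is plain Python rather than RapidMiner, and the toy data, the weights `(0.2, 1.0)` and the helper names are assumptions made up for illustration. A 1-nearest-neighbour lookup changes its answer once the attributes are scaled, while a one-attribute least-squares fit only rescales its slope and predicts the same value.

```python
import math


def nearest_neighbour(train, query):
    """Return the label of the training point closest to `query` (Euclidean)."""
    return min(train, key=lambda row: math.dist(row[0], query))[1]


def scale(point, weights):
    """Multiply every attribute by its weight."""
    return tuple(x * w for x, w in zip(point, weights))


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single attribute."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return a, mean_y - a * mean_x


# Hypothetical toy data and weights, not taken from the thread.
train = [((1.0, 0.0), "A"), ((0.0, 0.6), "B")]
query = (0.0, 0.0)
weights = (0.2, 1.0)

label_before = nearest_neighbour(train, query)
label_after = nearest_neighbour(
    [(scale(p, weights), label) for p, label in train], scale(query, weights)
)

# Linear regression: scaling x by w divides the slope by w, predictions stay put.
xs, ys, w = [0.0, 1.0, 2.0, 3.0], [1.0, 2.9, 5.1, 7.0], 0.2
a, b = fit_line(xs, ys)
a_scaled, b_scaled = fit_line([x * w for x in xs], ys)
pred_before = a * 4.0 + b
pred_after = a_scaled * (4.0 * w) + b_scaled

print(label_before, label_after)
print(pred_before, pred_after)
```

The nearest neighbour flips because the weight shrinks the distance along the first attribute only, whereas the regression absorbs the weight into its coefficient.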

Greetings,
Sebastian

radone · November 2009

Unfortunately,
the result from "Optimize Weights (Evolutionary)" gave me a success rate of 47.10%. When I repeat this experiment with weighted input and exactly the same XValid and training model (see below), I get only 39.03% +/- 0.27%, which is even worse than the 43.16% +/- 1.96% I get without weighting. I therefore suspect that something is wrong.

The "Optimize Weights (Evolutionary)" input example set (z-transform normalized) is:
1: [-3.258...2.452]; mean =-0.000
2: [-2.217...2.425]; mean =-0.000
...

The "Optimize Weights (Evolutionary)" output example set is:
1: [-9.775...7.358]; mean =0.000
2: [-3.385...3.704]; mean =0.000
...

If I compute correctly, the weights should be:
1: 3.000
2: 1.5268

but the weights returned by "Optimize Weights (Evolutionary)" are:
1: 1.0
2: 0.5075538111704915
...
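The expected factors can be re-derived from the ranges quoted above. This is only a sketch in plain Python: it divides each output bound by the matching input bound, using the rounded figures from this post, so the results agree with 3.000 and 1.5268 only up to rounding. The variable and function names are made up for illustration.

```python
# (min, max) of each attribute before and after the optimization, as quoted above.
input_ranges = {1: (-3.258, 2.452), 2: (-2.217, 2.425)}
output_ranges = {1: (-9.775, 7.358), 2: (-3.385, 3.704)}


def implied_factors(before, after):
    """Factor by which each bound of the range was multiplied."""
    return tuple(a / b for a, b in zip(after, before))


factors = {
    attr: implied_factors(input_ranges[attr], output_ranges[attr])
    for attr in input_ranges
}

for attr, (from_min, from_max) in factors.items():
    print(f"attribute {attr}: factor from min = {from_min:.4f}, from max = {from_max:.4f}")
```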

Do I understand something incorrectly?

Thank you for your time,
Radone


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Root">
    <process expanded="true" height="557" width="748">
      <operator activated="true" class="read_weights" expanded="true" height="60" name="Read Weights" width="90" x="112" y="390">
        <parameter key="attribute_weights_file" value="weight_47.wgt"/>
      </operator>
      <operator activated="true" class="read_aml" expanded="true" height="60" name="ExampleSource" width="90" x="112" y="120">
        <parameter key="attributes" value="data_percent_05.aml"/>
      </operator>
      <operator activated="true" class="normalize" expanded="true" height="94" name="Normalize" width="90" x="112" y="255"/>
      <operator activated="true" class="scale_by_weights" expanded="true" height="76" name="Scale by Weights" width="90" x="266" y="372"/>
      <operator activated="true" class="x_validation" expanded="true" height="112" name="XValid (3)" width="90" x="514" y="345">
        <parameter key="number_of_validations" value="3"/>
        <process expanded="true" height="751" width="496">
          <operator activated="true" class="support_vector_machine_libsvm" expanded="true" height="76" name="Learner (3)" width="90" x="179" y="30">
            <parameter key="C" value="10.0"/>
            <parameter key="cache_size" value="50"/>
            <list key="class_weights"/>
          </operator>
          <connect from_port="training" to_op="Learner (3)" to_port="training set"/>
          <connect from_op="Learner (3)" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true" height="751" width="496">
          <operator activated="true" class="apply_model" expanded="true" height="76" name="Applier (3)" width="90" x="45" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance_classification" expanded="true" height="76" name="Performance (3)" width="90" x="246" y="30">
            <parameter key="accuracy" value="true"/>
            <parameter key="classification_error" value="true"/>
            <list key="class_weights"/>
          </operator>
          <connect from_port="model" to_op="Applier (3)" to_port="model"/>
          <connect from_port="test set" to_op="Applier (3)" to_port="unlabelled data"/>
          <connect from_op="Applier (3)" from_port="labelled data" to_op="Performance (3)" to_port="labelled data"/>
          <connect from_op="Performance (3)" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Read Weights" from_port="output" to_op="Scale by Weights" to_port="weights"/>
      <connect from_op="ExampleSource" from_port="output" to_op="Normalize" to_port="example set input"/>
      <connect from_op="Normalize" from_port="example set output" to_op="Scale by Weights" to_port="example set"/>
      <connect from_op="Scale by Weights" from_port="example set" to_op="XValid (3)" to_port="training"/>
      <connect from_op="XValid (3)" from_port="training" to_port="result 1"/>
      <connect from_op="XValid (3)" from_port="averagable 1" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

keith · November 2009

This is just a guess, but the ratios of the weights, 1.5268/3.0 and 0.50755/1.0, are almost identical. For some learners, it is not the absolute values of the weights that matter but their relative scale. That is, if you multiply the evolutionarily optimized weights by three, you get the ones you were expecting, but the common factor of three doesn't actually alter the relationships between data points for the SVM.
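The ratio comparison can be checked with a few lines of Python. This is a sketch that uses only the two weights quoted in the thread and divides each weight vector by its largest entry; the names are illustrative.

```python
expected = [3.000, 1.5268]               # factors radone derived from the ranges
returned = [1.0, 0.5075538111704915]     # weights reported by the operator


def normalise_by_max(weights):
    """Rescale a weight vector so that its largest entry becomes 1.0."""
    top = max(weights)
    return [w / top for w in weights]


expected_rel = normalise_by_max(expected)
returned_rel = normalise_by_max(returned)

# Relative disagreement between the two normalised second weights.
rel_diff = abs(expected_rel[1] - returned_rel[1]) / returned_rel[1]

print(expected_rel, returned_rel)
print(f"relative difference: {rel_diff:.4%}")
```

The two normalised vectors differ by well under one percent, which fits the idea that the operator reports weights rescaled so the largest is 1.0, although the thread does not confirm that this is what happens internally.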

Also, I think that because of the generations and population members of the evolutionary step, you end up generating more folds of cross-validation by the time the weights are optimized than if you just run it through with a single set of weights. When you run a different process that just applies the weights, you're essentially at a different seed in the random number generator when creating the folds, so you are testing it on different cross-sections of data than what the weights were optimized on. It's expected that the reported performance would be different on an arbitrary data set than on the one the weights were optimized on. Or maybe I just talked myself into believing something that is totally inapplicable. :-) Maybe somebody smarter can back me up, or correct me.
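The effect of the random number generator on the folds can be sketched as follows. This is plain Python with a hypothetical `make_folds` helper, not RapidMiner's sampling code; the three folds match the `number_of_validations` value in the process above, while the 30 examples and the seeds are arbitrary.

```python
import random


def make_folds(n_examples, n_folds, seed):
    """Shuffle the example indices with the given seed and deal them into folds."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    return [sorted(indices[i::n_folds]) for i in range(n_folds)]


folds_run_a = make_folds(30, 3, seed=1)
folds_run_b = make_folds(30, 3, seed=2)
folds_run_a_again = make_folds(30, 3, seed=1)

print(folds_run_a[0])
print(folds_run_b[0])
print(folds_run_a == folds_run_a_again, folds_run_a == folds_run_b)
```

With the same seed the folds repeat exactly; with a different seed the examples land in different folds, so the performance is estimated on different splits.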

Keith

land · November 2009

Hi Keith,
of course the XValidation depends on the random generation of the folds. Since the random number sequence differs between runs, the results might differ slightly. But this is quite a big difference, at least for large data sets. What is the standard deviation of the results? It usually gives a good impression of how reliable the estimate is.
In general, the difference should vanish as the data set grows.

If the data is scaled differently after applying the weights, many learning algorithms might behave differently. You are using a linear SVM, and although it returns a linear hyperplane that could be found in exactly the same relative (!) position in differently scaled training data, I suspect that parameters like C have a different influence depending on the scale.
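The suspicion about C can be made concrete for one specific case: the textbook primal objective of a soft-margin SVM with a linear kernel, where every attribute is multiplied by the same factor s. This is a sketch under those assumptions and not LibSVM's implementation; the process XML above does not set a kernel type explicitly, and the toy data, the hyperplane and the function name are made up. In this setting, training on data scaled by s with penalty C corresponds to training on the unscaled data with penalty C * s^2.

```python
def hinge_objective(w, b, C, data):
    """Soft-margin linear SVM primal: 0.5 * ||w||^2 + C * sum of hinge losses."""
    regulariser = 0.5 * sum(wi * wi for wi in w)
    loss = sum(
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in data
    )
    return regulariser + C * loss


# Hypothetical toy data: (attributes, label in {-1, +1}).
data = [((1.0, 2.0), 1), ((-1.5, 0.5), -1), ((0.3, -0.2), 1), ((-0.4, -1.0), -1)]
s, C = 3.0, 10.0
scaled = [(tuple(s * xi for xi in x), y) for x, y in data]

# An arbitrary candidate hyperplane; w / s describes the same hyperplane in scaled space.
w, b = (0.7, -0.4), 0.1
w_scaled = tuple(wi / s for wi in w)

obj_scaled = hinge_objective(w_scaled, b, C, scaled)
obj_equivalent = hinge_objective(w, b, C * s * s, data) / (s * s)
obj_same_C = hinge_objective(w, b, C, data) / (s * s)

print(obj_scaled, obj_equivalent, obj_same_C)
```

Because the first two values agree for any hyperplane while the third does not, the scaled problem with the same C behaves like an unscaled problem with a nine times larger penalty. With non-uniform weights such as those in this thread, the geometry itself also changes, so this sketch covers only the common-factor part of the argument.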

So I will take a look at this matter as soon as I can, but currently I'm learning to swim...

Greetings,
Sebastian
