
Parameter Optimization gives different results?

kayman Member Posts: 662 Unicorn
edited November 2018 in Help

 

Can someone point me in the right direction on what I am doing wrong here? I want to get the best learning rate and momentum parameter values, but the results are slightly surprising.

I'm using an Optimize Parameters (Grid) operator, but if I take the best parameter output from the grid (87%) and apply those values directly to exactly the same data, the result is only 25%. This difference is rather big, and it stays about the same on new runs, so it's consistent.

 

Below is the process XML I used, so I assume I somehow misconnected things, but I can't get a hold on it ...

 

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.3.000" expanded="true" height="68" name="Retrieve wordvector" width="90" x="112" y="34">
<parameter key="repository_entry" value="../SampleData/wordvector"/>
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="7.3.000" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="313" y="34">
<list key="parameters">
<parameter key="Neural Net.learning_rate" value="[0.1;0.9;5;linear]"/>
<parameter key="Neural Net.momentum" value="[0.1;0.9;5;linear]"/>
</list>
<process expanded="true">
<operator activated="true" class="concurrency:cross_validation" compatibility="7.3.000" expanded="true" height="145" name="Cross Validation" width="90" x="112" y="34">
<process expanded="true">
<operator activated="true" class="neural_net" compatibility="7.3.000" expanded="true" height="82" name="Neural Net" width="90" x="179" y="34">
<list key="hidden_layers"/>
<parameter key="training_cycles" value="100"/>
<parameter key="learning_rate" value="0.1"/>
<parameter key="momentum" value="0.1"/>
</operator>
<connect from_port="training set" to_op="Neural Net" to_port="training set"/>
<connect from_op="Neural Net" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.3.000" expanded="true" height="82" name="Apply Model" width="90" x="179" y="85">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_classification" compatibility="7.3.000" expanded="true" height="82" name="Performance" width="90" x="313" y="85">
<list key="class_weights"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="performance 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_test set results" spacing="0"/>
<portSpacing port="sink_performance 1" spacing="0"/>
<portSpacing port="sink_performance 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log" compatibility="7.3.000" expanded="true" height="82" name="Log" width="90" x="313" y="85">
<list key="log">
<parameter key="learning rate" value="operator.Neural Net.parameter.learning_rate"/>
<parameter key="momentum" value="operator.Neural Net.parameter.momentum"/>
<parameter key="performance" value="operator.Cross Validation.value.performance 1"/>
</list>
</operator>
<connect from_port="input 1" to_op="Cross Validation" to_port="example set"/>
<connect from_op="Cross Validation" from_port="performance 1" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve wordvector" from_port="output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
</process>
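For readers outside RapidMiner, the wiring above, a parameter grid over learning rate and momentum with a cross-validation inside, can be sketched in Python with scikit-learn. This is a stand-in, not the actual process: the data is synthetic, and `MLPClassifier` plays the role of the Neural Net operator. The point it illustrates is that re-running a plain cross-validation with the winning parameters on the same data, same splits, and same seeds should reproduce the grid's best score:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the word-vector data set (hypothetical)
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Same idea as the process: grid over learning rate and momentum
param_grid = {
    "learning_rate_init": [0.1, 0.5, 0.9],
    "momentum": [0.1, 0.5, 0.9],
}
net = MLPClassifier(solver="sgd", hidden_layer_sizes=(10,),
                    max_iter=100, random_state=0)
grid = GridSearchCV(net, param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)

# Re-run a plain cross-validation with the winning parameters; with
# identical data, splits and seed this matches grid.best_score_
best = MLPClassifier(solver="sgd", hidden_layer_sizes=(10,),
                     max_iter=100, random_state=0, **grid.best_params_)
rerun = cross_val_score(best, X, y, cv=5, scoring="accuracy").mean()
print(round(grid.best_score_, 3), round(rerun, 3))
```

If the two numbers diverge as drastically as 87% vs. 25%, the usual suspects are different data reaching the two runs, different validation splits, or a miswired connection, rather than the optimizer itself.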

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Can you share the word vector data?

  • kayman Member Posts: 662 Unicorn

For some reason the zipped file gets removed each time. Any suggestion on how to share it?

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Do you have it as CSV?

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

    Hey Kayman,

     

are you sure that the grid is not overtraining? Put an X-Val around the Optimize operator to check this. Keep in mind that this step is necessary: the accuracy reported by Optimize can't be trusted on its own.

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
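Martin's suggestion, an X-Val wrapped around the optimizer, is what is usually called nested cross-validation. A minimal sketch of the idea, again using scikit-learn and synthetic data as hypothetical stand-ins for the RapidMiner operators and the word-vector set:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (hypothetical)
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Inner loop: the parameter grid with its own cross-validation
inner = GridSearchCV(
    MLPClassifier(solver="sgd", hidden_layer_sizes=(10,),
                  max_iter=100, random_state=0),
    {"learning_rate_init": [0.1, 0.5, 0.9], "momentum": [0.1, 0.5, 0.9]},
    cv=3, scoring="accuracy")

# Outer X-Val around the optimizer: each outer fold tunes only on its
# training part, so the mean below is an honest generalization estimate,
# unlike the optimistically biased best score reported by the grid itself
outer_scores = cross_val_score(inner, X, y, cv=3, scoring="accuracy")
print(round(outer_scores.mean(), 3))
```

The outer mean is typically lower than the inner grid's best score; that gap is exactly the selection bias Martin is warning about.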
  • kayman Member Posts: 662 Unicorn

    Of course, good old CSV. Should have thought about that one :-)

    Hope it goes through now

  • kayman Member Posts: 662 Unicorn

Fair enough, I fully understand it only gives an indication and not a foolproof golden solution. I only recently discovered the grid, and before that I was spending a huge amount of time doing this manually, so it looked like a huge timesaver to me. The data and process are far from optimized indeed; I'm just trying to get a better understanding of the logic behind the operator.

     

Therefore, if I run the process 10 times with 2 given parameters and the results are always comparable with only a minor difference, wouldn't it be safe to assume the grid would show the same results?

     

If the grid tells me 5 times out of 5 that it is around 85% accurate for a given parameter set, and entering the same parameters in the process tells me 5 times out of 5 that it's around 20% accurate, it seems the grid is giving me false results. Note that I am using exactly the same dataset in all scenarios, so even with crappy data the results should still be the same, even if they give me a false indication of success. Or am I overlooking something?

     

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

    Hey Kayman,

     

you overlooked that the performance returned by Optimize Grid might be overtrained and only good on exactly this data set, not on others. Your difference is extreme, though, so maybe something else is happening. But overtraining is also part of hyperparameter tuning.

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

The parameter optimization gives me a learning rate and momentum of 0.1 with +86% accuracy. I plug that into a separate Cross Validation with a Neural Net at a learning rate and momentum of 0.1, and get 86% accuracy.

     

    (screenshots attached: Parameter Opt.png and Final Results.png)

     

     

  • kayman Member Posts: 662 Unicorn

Well, for some weird reason it works for me now too. I have no clue why or how, but after just re-entering the values I now get the same results from both flows.

    Thanks for checking anyway, hope I didn't waste too much of your time!

  • swordf Member Posts: 3 Contributor I
    Is anyone still looking at this issue?
    I have the same problem: I get different results when I apply the parameters suggested by the grid optimizer.
    I don't understand why; even if the result isn't identical, it should at least be close.