WARNING: Caught exception / Cannot reset network to a smaller learning rate

B_B_ Member Posts: 70 Maven
edited November 2018 in Help
I received this message while running a small neural net example. No stack trace was shown.



<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
    <parameter key="parallelize_main_process" value="true"/>
    <process expanded="true" height="691" width="1090">
      <operator activated="true" class="retrieve" compatibility="5.0.10" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="5.0.10" expanded="true" height="76" name="Generate ID" width="90" x="45" y="165">
        <parameter key="create_nominal_ids" value="true"/>
      </operator>
      <operator activated="true" class="subprocess" compatibility="5.0.10" expanded="true" height="76" name="Make Imbalance" width="90" x="45" y="300">
        <process expanded="true" height="556" width="1139">
          <operator activated="true" class="nominal_to_numerical" compatibility="5.0.10" expanded="true" height="94" name="Nominal to Numerical (2)" width="90" x="112" y="120">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="class"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="split_data" compatibility="5.0.10" expanded="true" height="94" name="Split Data (3)" width="90" x="246" y="165">
            <enumeration key="partitions">
              <parameter key="ratio" value="0.1"/>
              <parameter key="ratio" value="0.9"/>
            </enumeration>
            <parameter key="sampling_type" value="linear sampling"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.0.10" expanded="true" height="76" name="Filter Examples (4)" width="90" x="380" y="255">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="class=1"/>
          </operator>
          <operator activated="true" class="filter_examples" compatibility="5.0.10" expanded="true" height="76" name="Filter Examples (3)" width="90" x="380" y="75">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="class&lt;1"/>
          </operator>
          <operator activated="true" class="append" compatibility="5.0.10" expanded="true" height="94" name="Append (2)" width="90" x="658" y="179"/>
          <connect from_port="in 1" to_op="Nominal to Numerical (2)" to_port="example set input"/>
          <connect from_op="Nominal to Numerical (2)" from_port="example set output" to_op="Split Data (3)" to_port="example set"/>
          <connect from_op="Split Data (3)" from_port="partition 1" to_op="Filter Examples (3)" to_port="example set input"/>
          <connect from_op="Split Data (3)" from_port="partition 2" to_op="Filter Examples (4)" to_port="example set input"/>
          <connect from_op="Filter Examples (4)" from_port="example set output" to_op="Append (2)" to_port="example set 1"/>
          <connect from_op="Filter Examples (3)" from_port="example set output" to_op="Append (2)" to_port="example set 2"/>
          <connect from_op="Append (2)" from_port="merged set" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="source_in 2" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.0.10" expanded="true" height="94" name="Multiply" width="90" x="45" y="390"/>
      <operator activated="true" class="split_data" compatibility="5.0.10" expanded="true" height="94" name="Split Data" width="90" x="45" y="525">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.5"/>
          <parameter key="ratio" value="0.5"/>
        </enumeration>
        <parameter key="sampling_type" value="stratified sampling"/>
      </operator>
      <operator activated="true" class="numerical_to_polynominal" compatibility="5.0.10" expanded="true" height="76" name="Numerical to Polynominal" width="90" x="246" y="210">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="generate_weight_stratification" compatibility="5.0.10" expanded="true" height="76" name="Generate Weight (Stratification)" width="90" x="380" y="210"/>
      <operator activated="true" class="nominal_to_numerical" compatibility="5.0.10" expanded="true" height="94" name="Nominal to Numerical (3)" width="90" x="514" y="210">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="neural_net" compatibility="5.0.10" expanded="true" height="76" name="NNModel" width="90" x="715" y="165">
        <list key="hidden_layers">
          <parameter key="1" value="-1"/>
          <parameter key="2" value="-1"/>
          <parameter key="3" value="-1"/>
        </list>
        <parameter key="training_cycles" value="5000"/>
        <parameter key="learning_rate" value="0.49999999999999994"/>
        <parameter key="momentum" value="0.59"/>
      </operator>
      <operator activated="true" class="numerical_to_polynominal" compatibility="5.0.10" expanded="true" height="76" name="Numerical to Polynominal (3)" width="90" x="313" y="345">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="generate_weight_stratification" compatibility="5.0.10" expanded="true" height="76" name="Generate Weight (3)" width="90" x="447" y="345"/>
      <operator activated="true" class="nominal_to_numerical" compatibility="5.0.10" expanded="true" height="94" name="Nominal to Numerical (5)" width="90" x="583" y="345">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.0.10" expanded="true" height="76" name="Apply Model" width="90" x="782" y="300">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="numerical_to_binominal" compatibility="5.0.10" expanded="true" height="76" name="Numerical to Binominal" width="90" x="916" y="210">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="prediction(class)"/>
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="min" value="-1000.0"/>
        <parameter key="max" value="0.5"/>
      </operator>
      <operator activated="true" class="numerical_to_polynominal" compatibility="5.0.10" expanded="true" height="76" name="Numerical to Polynominal (2)" width="90" x="246" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="generate_weight_stratification" compatibility="5.0.10" expanded="true" height="76" name="Generate Weight (2)" width="90" x="380" y="30"/>
      <operator activated="true" class="nominal_to_numerical" compatibility="5.0.10" expanded="true" height="94" name="Nominal to Numerical (4)" width="90" x="514" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="class"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="optimize_parameters_grid" compatibility="5.0.10" expanded="true" height="94" name="Optimize Parameters (Grid)" width="90" x="648" y="30">
        <list key="parameters">
          <parameter key="NNTrain.learning_rate" value="[.1;.9;3;linear]"/>
          <parameter key="NNTrain.momentum" value="[0.0;1.0;4;linear]"/>
        </list>
        <parameter key="parallelize_optimization_process" value="true"/>
        <process expanded="true" height="619" width="1139">
          <operator activated="true" class="x_validation" compatibility="5.0.10" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
            <parameter key="number_of_validations" value="3"/>
            <parameter key="parallelize_training" value="true"/>
            <parameter key="parallelize_testing" value="true"/>
            <process expanded="true" height="637" width="553">
              <operator activated="true" class="neural_net" compatibility="5.0.10" expanded="true" height="76" name="NNTrain" width="90" x="246" y="30">
                <list key="hidden_layers"/>
                <parameter key="learning_rate" value="0.6333333333333333"/>
                <parameter key="momentum" value="1.0"/>
                <parameter key="use_local_random_seed" value="true"/>
              </operator>
              <connect from_port="training" to_op="NNTrain" to_port="training set"/>
              <connect from_op="NNTrain" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="637" width="553">
              <operator activated="true" class="apply_model" compatibility="5.0.10" expanded="true" height="76" name="Apply Model (2)" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_regression" compatibility="5.0.10" expanded="true" height="76" name="Performance" width="90" x="299" y="30">
                <parameter key="absolute_error" value="true"/>
                <parameter key="normalized_absolute_error" value="true"/>
                <parameter key="squared_error" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="log" compatibility="5.0.10" expanded="true" height="76" name="Log" width="90" x="581" y="30">
            <parameter key="filename" value="R:\DataRMDep\churntest.log"/>
            <list key="log">
              <parameter key="train_learn_rate" value="operator.NNTrain.parameter.learning_rate"/>
              <parameter key="train_momentum" value="operator.NNTrain.parameter.momentum"/>
              <parameter key="performance" value="operator.Validation.value.performance"/>
            </list>
          </operator>
          <connect from_port="input 1" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
          <connect from_op="Log" from_port="through 1" to_port="performance"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_parameters" compatibility="5.0.10" expanded="true" height="60" name="Set Parameters" width="90" x="782" y="30">
        <list key="name_map">
          <parameter key="NNTrain" value="NNModel"/>
        </list>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Make Imbalance" to_port="in 1"/>
      <connect from_op="Make Imbalance" from_port="out 1" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Split Data" to_port="example set"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Numerical to Polynominal (2)" to_port="example set input"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Numerical to Polynominal" to_port="example set input"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Numerical to Polynominal (3)" to_port="example set input"/>
      <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Generate Weight (Stratification)" to_port="example set input"/>
      <connect from_op="Generate Weight (Stratification)" from_port="example set output" to_op="Nominal to Numerical (3)" to_port="example set input"/>
      <connect from_op="Nominal to Numerical (3)" from_port="example set output" to_op="NNModel" to_port="training set"/>
      <connect from_op="NNModel" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Numerical to Polynominal (3)" from_port="example set output" to_op="Generate Weight (3)" to_port="example set input"/>
      <connect from_op="Generate Weight (3)" from_port="example set output" to_op="Nominal to Numerical (5)" to_port="example set input"/>
      <connect from_op="Nominal to Numerical (5)" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Numerical to Binominal" to_port="example set input"/>
      <connect from_op="Apply Model" from_port="model" to_port="result 2"/>
      <connect from_op="Numerical to Binominal" from_port="example set output" to_port="result 1"/>
      <connect from_op="Numerical to Polynominal (2)" from_port="example set output" to_op="Generate Weight (2)" to_port="example set input"/>
      <connect from_op="Generate Weight (2)" from_port="example set output" to_op="Nominal to Numerical (4)" to_port="example set input"/>
      <connect from_op="Nominal to Numerical (4)" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
      <connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_op="Set Parameters" to_port="parameter set"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    This seems to be a bug in the library the Neural Net depends on. Try changing your learning rate parameter settings.

    Greetings,
      Sebastian
  • chaosbringer Member Posts: 21 Contributor II
    Hi,
    Which library are you referring to?
    Is there any chance that it will get fixed soon?

    Thank you
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    sorry, but currently this isn't on our priority list. As I said: try using a different learning rate parameter. Or even better: Try other learning algorithms if possible. In my experience Neural Nets are powerful multitools, but can be outperformed by more specialized algorithms.

    Greetings,
      Sebastian
  • chaosbringer Member Posts: 21 Contributor II
    Hi,
    sorry for continuing this thread, but I have a few more questions about this problem.
    I get the "too small learning rate" error from both the RapidMiner/Joone neural net and the Weka multilayer perceptron.

    So, is it really an implementation bug if it occurs in both neural net implementations?
    Why does the learning rate converge to 0, even though I did not set the decay parameter? Is this normal in back-propagation?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    as far as I know: no. It doesn't make much sense to me at all.
    Unfortunately, I don't have the time to dive into the code to solve that problem. But if there is a solution, please tell me and I will include it in the code immediately.

    Greetings,
      Sebastian
  • david Member Posts: 3 Contributor I
    I don't know if my solution applies to this problem, but here it is.

    I got the same error message with a dataset, and for me it turned out to be a divide-by-zero problem. My dataset contains odds, and I calculate percentages from them as 1/odds. When the odds were zero, I got this error while trying to run a parameter optimization. These values showed up as ∞ in my example dataset. Simply removing the zero-odds rows from the dataset solved my problem.
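
A minimal sketch of the clean-up described above, assuming the rows are plain Python dictionaries with a hypothetical `odds` field (the names `implied_probability` and `clean_rows` are illustrative and not part of RapidMiner). It drops rows where 1/odds would be infinite or undefined before the data reaches the learner:

```python
import math


def implied_probability(odds):
    """Return 1/odds, or None when odds is zero or not a finite number."""
    if odds == 0 or not math.isfinite(odds):
        return None
    return 1.0 / odds


def clean_rows(rows):
    """Keep only rows whose odds yield a finite probability."""
    cleaned = []
    for row in rows:
        p = implied_probability(row["odds"])
        if p is not None:
            # Copy the row and attach the derived attribute.
            cleaned.append({**row, "probability": p})
    return cleaned
```

In RapidMiner itself the same effect can presumably be achieved by filtering the examples before the derived attribute is generated.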
  • tdhollan Member Posts: 2 Contributor I
    Dear all

    I am experiencing the same problem. It seems to occur in about one out of two attempts to use RapidMiner for neural network training. The data set I am using is clean and contains no missing or infinite values. Changing the learning rate does not help either.

    At least in my experience, this is a major bug that may prevent me from using rapidminer further if there are no proper workarounds. Any developments in relation to this topic?

    Many thanks in advance

    Thomas
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi Thomas,

    thanks for bringing this issue back to our minds. The learning rate is automatically decreased by dividing the rate by 2 if the calculated error is infinite - which might happen through a data error or since nothing has been learned at all. If this happens too often, the learning rate becomes too small (close to 0) and the message is presented.
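
The halving behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not RapidMiner's actual code: `train_once` is a hypothetical callable that trains with a given learning rate and returns the error, and the `min_rate` threshold is an assumed stand-in for "too small (close to 0)".

```python
import math


def train_with_halving(train_once, learning_rate, min_rate=1e-12):
    """Retry training, halving the learning rate while the error is not finite.

    train_once: hypothetical callable, learning rate -> training error (float).
    min_rate:   assumed cut-off below which the rate counts as "too small".
    """
    while True:
        error = train_once(learning_rate)
        if math.isfinite(error):
            return learning_rate, error
        if learning_rate <= min_rate:
            raise RuntimeError("Cannot reset network to a smaller learning rate.")
        learning_rate /= 2.0


def toy_trainer(rate):
    # Toy stand-in for a diverging network: the error overflows for large rates.
    return math.inf if rate > 0.15 else rate


def always_infinite(rate):
    # Mimics a data problem (e.g. an infinite input value): no rate ever helps.
    return math.inf
```

With `toy_trainer`, a starting rate of 0.8 is halved until training succeeds. With `always_infinite`, the rate shrinks towards 0 and the error is raised, which matches the reports in this thread where changing the initial learning rate did not help.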

    Since the neural network performs perfectly well on our testing data sets (we hardly use NN in our projects for several reasons) we unfortunately cannot reproduce this and check if there is any problem here or if this indeed is intended behaviour. Would it be possible to share a data set and a process with us so that we can try to reproduce the problem?

    If yes, please contact us at contact ( at ) rapid-i.com.

    Thanks for your support. Cheers,
    Ingo
  • tdhollan Member Posts: 2 Contributor I
    Hi Ingo

    Thanks for your prompt reply. I have sent you an email with a scenario attached.

    I have some more questions regarding the current neural net implementation:
    1) I have not been able to set up a network without hidden layer, following the instructions in the help box.
    2) It seems to me that the essential "weight decay" parameter is missing? The decay checkbox refers to the decrease in learning rate, not to the regularization of the weights (an important tuning parameter).
    3) Am I correct that the activation function is not (yet) controllable by the user?
    4) I think it would be useful to make the number of hidden neurons controllable from parameter grid search operators.
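
To illustrate the distinction drawn in point 2, here is a small sketch assuming plain gradient descent with L2 regularization. The function names and the linear schedule are hypothetical and are not taken from RapidMiner. Weight decay changes the update of each weight, while learning-rate decay only changes the step size over the training cycles:

```python
def sgd_step(weight, gradient, learning_rate, weight_decay=0.0):
    """One gradient step; weight_decay adds an L2 penalty that shrinks the weight."""
    return weight - learning_rate * (gradient + weight_decay * weight)


def decayed_learning_rate(initial_rate, cycle, total_cycles):
    """Hypothetical linear schedule: the step size shrinks as training proceeds."""
    return initial_rate * (1.0 - cycle / total_cycles)
```

With a zero gradient, `sgd_step` still pulls the weight towards 0 when `weight_decay` is positive, which is the regularizing effect that a learning-rate schedule alone does not provide.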

    Best

    Thomas
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi Thomas,

    thanks for sending in the bug report. Our developers will take a deeper look into that. Thanks also for the other questions and comments. The help text seems to refer to an older version of the operator, which was indeed more powerful (networks without hidden layers were possible, and the activation function was controllable) but was slow and delivered pretty bad prediction performance. It seems that those two features were removed (probably for some good reason) during the redesign.

    Control via parameter optimization is impossible not only for the number of neurons but for all list-based parameter settings. A solution for that is far from trivial, but the issue is on our (ever-growing) to-do list.

    Cheers,
    Ingo
  • churtado Member Posts: 1 Contributor I
    Hi everybody, I had the same error message when training a forecast model using the Neural Net operator. I realized that the training dataset had a null label value. I removed it and everything went well.
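
A minimal sketch of the check described above, assuming the examples are Python dictionaries with a hypothetical `label` key (the helper names are illustrative and not a RapidMiner API). It removes examples whose label is missing before training:

```python
import math


def has_valid_label(example, label_key="label"):
    """Return True if the example has a label that is neither None nor NaN."""
    value = example.get(label_key)
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return True


def drop_unlabelled(examples, label_key="label"):
    """Keep only the examples that carry a usable label."""
    return [ex for ex in examples if has_valid_label(ex, label_key)]
```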