"Neural net: hidden layer size"

ambrul11 Member Posts: 2 Contributor I
edited June 2019 in Help
I’m trying to determine the best number of nodes for a hidden layer (just a single layer for simplicity): 1, 2, 3, ..., 10 nodes?
If I try 1, 2 or 3 nodes, I get the following results:

Error                  Number of nodes
0.4090900341097636     1
0.41810851436813457    2
0.4172135316516921     3

Now, if I only try 2 or 3 nodes, I get the following results:

Error                  Number of nodes
0.4252654926064915     2
0.4188150897563617     3

Why does the error (validation performance) change? The process is below:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data" compatibility="5.3.015" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
        <parameter key="number_examples" value="1000"/>
      </operator>
      <operator activated="true" class="loop_parameters" compatibility="5.3.015" expanded="true" height="76" name="Loop Parameters" width="90" x="313" y="30">
        <list key="parameters">
          <parameter key="Set Macro.value" value="1,2,3"/>
        </list>
        <process expanded="true">
          <operator activated="true" class="set_macro" compatibility="5.3.015" expanded="true" height="76" name="Set Macro" width="90" x="45" y="30">
            <parameter key="macro" value="L1"/>
            <parameter key="value" value="3"/>
          </operator>
          <operator activated="true" class="x_validation" compatibility="5.3.015" expanded="true" height="112" name="Validation" width="90" x="313" y="30">
            <parameter key="number_of_validations" value="5"/>
            <parameter key="sampling_type" value="shuffled sampling"/>
            <parameter key="use_local_random_seed" value="true"/>
            <parameter key="local_random_seed" value="1"/>
            <process expanded="true">
              <operator activated="true" class="neural_net" compatibility="5.3.015" expanded="true" height="76" name="Neural Net" width="90" x="179" y="30">
                <list key="hidden_layers">
                  <parameter key="Layer1" value="%{L1}"/>
                </list>
              </operator>
              <connect from_port="training" to_op="Neural Net" to_port="training set"/>
              <connect from_op="Neural Net" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_regression" compatibility="5.3.015" expanded="true" height="76" name="Performance" width="90" x="282" y="30">
                <parameter key="root_mean_squared_error" value="false"/>
                <parameter key="relative_error_lenient" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="log" compatibility="5.3.015" expanded="true" height="76" name="Log" width="90" x="581" y="30">
            <list key="log">
              <parameter key="Err" value="operator.Validation.value.performance"/>
              <parameter key="Nodes" value="operator.Set Macro.value.macro_value"/>
            </list>
          </operator>
          <connect from_port="input 1" to_op="Set Macro" to_port="through 1"/>
          <connect from_op="Set Macro" from_port="through 1" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Loop Parameters" to_port="input 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,249 RM Data Scientist
    The answer is simple: the random seed.

    Run your test with a local random seed and the results will be the same in both runs.

    Be careful: this variation shows you the inherent statistical error of your model. A high fluctuation here means you have an unstable model.
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • ambrul11 Member Posts: 2 Contributor I
    Thank you. The random seed was the first thing I checked. However, I had only set it on the validation. After adding it to the neural net as well, everything seems to work just fine.
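
For anyone landing here with the same problem, the fix described above could look roughly like the fragment below. It is only a sketch: it assumes the Neural Net operator accepts the same `use_local_random_seed` and `local_random_seed` parameter keys that the Validation operator already uses in the posted process, and the seed value `1` is arbitrary. Without a local seed, the net most likely draws its initial weights from the process-wide random generator, so the result for 2 nodes depends on whether a 1-node run came before it.

```xml
<operator activated="true" class="neural_net" compatibility="5.3.015" expanded="true" height="76" name="Neural Net" width="90" x="179" y="30">
  <list key="hidden_layers">
    <parameter key="Layer1" value="%{L1}"/>
  </list>
  <!-- Assumed keys, mirroring the Validation operator in the posted process. -->
  <!-- Any fixed seed works; 1 is just an example value. -->
  <parameter key="use_local_random_seed" value="true"/>
  <parameter key="local_random_seed" value="1"/>
</operator>
```

With both the validation split and the weight initialization pinned, the logged error for a given number of nodes should no longer depend on which other values the loop tries. Keep in mind the earlier warning: the spread seen without a fixed seed is still a fair indication of how stable the model is.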