Optimize Deep Learning's network structure and parameters

phivu · January 2017

Hi RapidMiner,

I'm doing regression using the "Deep Learning" operator with 480 input features (this is a predictive maintenance problem: each feature is a meter reading, and we want to apply regression to predict the next time-to-fail of an asset). After training the "Deep Learning" operator, the Root Mean Square Error (RMSE) on the training dataset is still quite high (from 0.08 to 0.2), although the training dataset is normalized into [-1; 1]. I also tried a lot of network structures, including increasing the number of hidden layers (up to 15) and increasing the number of nodes in each hidden layer (up to 1000 nodes/layer). In some cases, doing so even increases the RMSE on the training dataset, which suggests the model is under-fitting. I used the default values for the other deep learning parameters, including the adaptive learning rate and rectifier activation.

So do you have any advice for this situation, or is there any way to optimize the network structure? (coz I already tried the "Optimize Parameters" operator for "Deep Learning" but could not find the operator's parameters for network structure). Or is there any way to make the Deep Learning operator fit the training data better?

Thank you very much for your help!

Best regards,
Phivu
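
To put the reported RMSE in perspective, here is a minimal Python sketch (not part of the original process). It assumes the label was scaled to [-1; 1] along with the features, which the post does not state explicitly; the helper names are made up for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_as_fraction_of_range(rmse_value, lower=-1.0, upper=1.0):
    """Express an RMSE as a fraction of the label's value range.

    The default range [-1, 1] is an assumption about how the label was scaled.
    """
    return rmse_value / (upper - lower)


# The two RMSE bounds quoted in the post.
for value in (0.08, 0.2):
    share = rmse_as_fraction_of_range(value)
    print(f"RMSE {value} is {share:.0%} of the label range")
```

Under that assumption, an RMSE of 0.08 to 0.2 corresponds to roughly 4% to 10% of the label range.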

Thomas_Ott · January 2017

I'm not sure what you mean by "coz I already tried the "Optimize Parameters" operator for "Deep Learning" but could not find the operator's parameters for network structure." You used an Optimize Parameters operator, embedded a Cross Validation operator in it, and put a Deep Learning operator inside that... and nothing showed up?

phivu · January 2017

Hi Thomas,

Yes. I attached the process and a screenshot here. In the screenshot, you can see the "hidden_layer_sizes" parameter is disabled when I choose the "Deep Learning" operator for optimization. So how can I enable it?

Thomas_Ott · January 2017

Huh, that's interesting. Try using Set Macro operators to supply the hidden layer sizes and vary those values.

<?xml version="1.0" encoding="UTF-8"?><process version="7.3.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="optimize_parameters_grid" compatibility="7.3.001" expanded="true" height="166" name="Optimize Parameters (Grid)" width="90" x="447" y="34">
        <list key="parameters">
          <parameter key="Set Macro.value" value="[0;100;10;linear]"/>
          <parameter key="Set Macro (2).value" value="[0;100;10;linear]"/>
        </list>
        <process expanded="true">
          <operator activated="true" class="set_macro" compatibility="7.3.001" expanded="true" height="82" name="Set Macro" width="90" x="112" y="34">
            <parameter key="macro" value="layer1"/>
            <parameter key="value" value="0.0"/>
          </operator>
          <operator activated="true" class="set_macro" compatibility="7.3.001" expanded="true" height="82" name="Set Macro (2)" width="90" x="246" y="34">
            <parameter key="macro" value="layer2"/>
            <parameter key="value" value="0.0"/>
          </operator>
          <operator activated="true" class="concurrency:cross_validation" compatibility="7.3.001" expanded="true" height="145" name="Cross Validation" width="90" x="447" y="34">
            <process expanded="true">
              <operator activated="true" class="h2o:deep_learning" compatibility="7.3.000" expanded="true" height="82" name="Deep Learning" width="90" x="179" y="34">
                <enumeration key="hidden_layer_sizes">
                  <parameter key="hidden_layer_sizes" value="%{layer1}"/>
                  <parameter key="hidden_layer_sizes" value="%{layer2}"/>
                </enumeration>
                <enumeration key="hidden_dropout_ratios"/>
                <list key="expert_parameters"/>
                <list key="expert_parameters_"/>
              </operator>
              <connect from_port="training set" to_op="Deep Learning" to_port="training set"/>
              <connect from_op="Deep Learning" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.3.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_regression" compatibility="7.3.001" expanded="true" height="82" name="Performance (2)" width="90" x="246" y="34"/>
              <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
              <connect from_op="Performance (2)" from_port="performance" to_port="performance 1"/>
              <connect from_op="Performance (2)" from_port="example set" to_port="test set results"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="0"/>
              <portSpacing port="sink_performance 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="log" compatibility="7.3.001" expanded="true" height="82" name="Log" width="90" x="648" y="136">
            <parameter key="filename" value="D:\Predictive-Maintenance-Project\RapidMiner-Processes\SVM-Parameters\Fault1C-log-values-gamma-0.001-0.05-XV.txt.log"/>
            <list key="log">
              <parameter key="Count" value="operator.SVM.value.applycount"/>
              <parameter key="SVM C" value="operator.SVM.parameter.C"/>
              <parameter key="SVM gamma" value="operator.SVM.parameter.gamma"/>
              <parameter key="Testing Accuracy" value="operator.Validation.value.performance"/>
            </list>
          </operator>
          <connect from_port="input 1" to_op="Set Macro" to_port="through 1"/>
          <connect from_op="Set Macro" from_port="through 1" to_op="Set Macro (2)" to_port="through 1"/>
          <connect from_op="Set Macro (2)" from_port="through 1" to_op="Cross Validation" to_port="example set"/>
          <connect from_op="Cross Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Cross Validation" from_port="example set" to_port="result 2"/>
          <connect from_op="Cross Validation" from_port="test result set" to_port="result 3"/>
          <connect from_op="Cross Validation" from_port="performance 1" to_op="Log" to_port="through 1"/>
          <connect from_op="Log" from_port="through 1" to_port="performance"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 1"/>
      <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

cyborghijacker · August 2017

Hi Thomas

I have encountered the same issue, where the network hyperparameters are greyed out in the grid search UI. May I know how to implement the method you have mentioned? Where do I place the Set Macros?

Thank you

Regards

Corse

Thomas_Ott · August 2017

Ah I think you have to toggle on 'List.'

cyborghijacker · August 2017

How would I do that? The options are greyed out actually (i.e. Grid/List)

FBT · August 2017

You need to select the parameters you would like to optimize first (the middle column on top). Once you have done that the selection between "Grid" and "List" becomes available.

cyborghijacker · August 2017

Yes, it works with the other parameters, but in this case the hidden_layer_sizes and hidden_dropout_ratios parameters are greyed out, and clicking on them yields no further selection options.

Has anyone else encountered this issue? Apart from this, doing a manual search is possible, but it will be a hair thinning exercise. I shuddered at that thought.

FBT · August 2017

Ok. Apologies, I did not read the full thread up until now. It is indeed a bit unusual that this is greyed out. Have you taken a look at the process that @Thomas_Ott posted above? In it, he set macros for the layer sizes within the "Optimize Parameters" operator, which are then used in the hidden layers panel of the "Deep Learning" operator, instead of using actual numerical values --> see screenshot.

If you do that, you should be able to select those said "Set Macros" operators within the optimization panel and enter the values that you would like to iterate through.

cyborghijacker · August 2017

No problem, and thank you for the clarification.

In fact, in Thomas' earlier video (https://www.youtube.com/watch?v=R5vPrTLMzng&list=PLLdyjsPklEJvpxLHZM-llt40FLw-qEiZO&index=1), around the 8:40 min mark, we also see that hidden_layers for Neural network is greyed out, although it was not highlighted.

I am still unsure how to use the Set Macros using the script Thomas posted. Where should it be placed? Nested within the Optimize Parameters operator before Cross Validation?

Regards

Ben

FBT · August 2017

You can just copy the XML of Thomas' process and paste it into your XML panel in RapidMiner. Then press the green checkmark and the process will appear in the process panel.

cyborghijacker · August 2017

Hi FBT

I see. In essence, it's just creating as many Set Macro operators as there are hidden layers? I'm assuming the "Set Macros" operator can be used instead of defining one Set Macro for each hidden layer. For the optimization, I will still have to change the macro value (= no. of nodes) manually, correct?

Regards

Ben

FBT · August 2017

Hi,

not really. The whole point about using "Set Macro" operators is to make the parameters available within the "Optimize" operator in which you can set the values you would like to iterate through. Take a look at the screenshot:

I have renamed the two "Set Macro" operators from Thomas' process to make it a bit clearer in the optimization pane. You can now either use the grid selection by setting a range of values and the distance between values for the layer, or you directly define your desired values with a list (you would just need to check the "List" radio button).

Using the "Set Macros" operator, with which several macros can be defined at once, brings you back to the problem of the selection being "greyed out" in the optimization pane. Hence you need one "Set Macro" operator for every hidden layer you want to iterate through.
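
As a rough illustration of what the grid setting does, the Python sketch below enumerates the combinations for the two ranges in Thomas' process, `[0;100;10;linear]` each. It assumes that a grid with 10 linear steps yields 11 evenly spaced values (both end points included); `linear_grid` is a hypothetical helper, not a RapidMiner function.

```python
from itertools import product


def linear_grid(minimum, maximum, steps):
    """Return steps + 1 evenly spaced values from minimum to maximum.

    Assumption: this mirrors how a linear grid range is expanded.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    width = (maximum - minimum) / steps
    return [minimum + i * width for i in range(steps + 1)]


# One range per "Set Macro" operator, i.e. one per hidden layer.
layer1_values = linear_grid(0, 100, 10)
layer2_values = linear_grid(0, 100, 10)

# The grid search tries every pairing of the two layer sizes.
combinations = list(product(layer1_values, layer2_values))

print(f"{len(layer1_values)} values per layer, {len(combinations)} combinations")
```

With these assumptions there are 11 values per layer and 121 combinations, and each combination runs a full cross-validation, so every additional hidden layer multiplies the run time.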

cyborghijacker · August 2017

Hi FBT

I see, I have a clearer picture now; your first statement was key. Now I face another strange problem message. I defined 3 layers using Set Macro operators connected to a Cross Validation with a Deep Learning operator within it.

Any chance this error could be diagnosed? Has it got to do with strings vs. integers in RapidMiner?

FBT · August 2017

Ok, you are right. Values set in the "Set Macro" operator are strings. This is usually not a problem in other contexts, as you can use eval() or parse() to convert it to a number. In your context, however, this does not work and we need to get a bit more creative. What seems to work is the following:

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.5.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="optimize_parameters_grid" compatibility="7.5.003" expanded="true" height="166" name="Optimize Parameters (Grid)" width="90" x="380" y="34">
        <list key="parameters">
          <parameter key="Extract Macro - Layer 1.example_index" value="[1;6;6;linear]"/>
          <parameter key="Extract Macro - Layer 2.example_index" value="[1;6;6;linear]"/>
        </list>
        <process expanded="true">
          <operator activated="true" class="read_excel" compatibility="7.5.003" expanded="true" height="68" name="Read Excel" width="90" x="45" y="136">
            <parameter key="first_row_as_names" value="false"/>
            <list key="annotations">
              <parameter key="0" value="Name"/>
            </list>
            <list key="data_set_meta_data_information"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="7.5.003" expanded="true" height="68" name="Extract Macro - Layer 1" width="90" x="179" y="136">
            <parameter key="macro" value="layer1"/>
            <parameter key="macro_type" value="data_value"/>
            <parameter key="attribute_name" value="LayerSize"/>
            <parameter key="example_index" value="6"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="extract_macro" compatibility="7.5.003" expanded="true" height="68" name="Extract Macro - Layer 2" width="90" x="313" y="136">
            <parameter key="macro" value="layer2"/>
            <parameter key="macro_type" value="data_value"/>
            <parameter key="attribute_name" value="LayerSize"/>
            <parameter key="example_index" value="6"/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="concurrency:cross_validation" compatibility="7.5.003" expanded="true" height="145" name="Cross Validation" width="90" x="447" y="34">
            <process expanded="true">
              <operator activated="true" class="h2o:deep_learning" compatibility="7.5.000" expanded="true" height="82" name="Deep Learning" width="90" x="179" y="34">
                <enumeration key="hidden_layer_sizes">
                  <parameter key="hidden_layer_sizes" value="%{layer1}"/>
                  <parameter key="hidden_layer_sizes" value="%{layer2}"/>
                </enumeration>
                <enumeration key="hidden_dropout_ratios"/>
                <list key="expert_parameters"/>
                <list key="expert_parameters_"/>
              </operator>
              <connect from_port="training set" to_op="Deep Learning" to_port="training set"/>
              <connect from_op="Deep Learning" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.5.003" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_classification" compatibility="7.5.003" expanded="true" height="82" name="Performance" width="90" x="246" y="34">
                <list key="class_weights"/>
              </operator>
              <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
              <connect from_op="Performance" from_port="example set" to_port="test set results"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="0"/>
              <portSpacing port="sink_performance 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="log" compatibility="7.5.003" expanded="true" height="82" name="Log" width="90" x="648" y="136">
            <list key="log">
              <parameter key="Count" value="operator.SVM.value.applycount"/>
              <parameter key="SVM C" value="operator.SVM.parameter.C"/>
              <parameter key="SVM gamma" value="operator.SVM.parameter.gamma"/>
              <parameter key="Testing Accuracy" value="operator.Validation.value.performance"/>
            </list>
          </operator>
          <connect from_port="input 1" to_op="Cross Validation" to_port="example set"/>
          <connect from_op="Read Excel" from_port="output" to_op="Extract Macro - Layer 1" to_port="example set"/>
          <connect from_op="Extract Macro - Layer 1" from_port="example set" to_op="Extract Macro - Layer 2" to_port="example set"/>
          <connect from_op="Cross Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Cross Validation" from_port="example set" to_port="result 2"/>
          <connect from_op="Cross Validation" from_port="test result set" to_port="result 3"/>
          <connect from_op="Cross Validation" from_port="performance 1" to_op="Log" to_port="through 1"/>
          <connect from_op="Log" from_port="through 1" to_port="performance"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Optimize Parameters (Grid)" from_port="parameter" to_port="result 1"/>
      <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

This process is very similar to the one before, however, instead of using "Set Macro" operators, you start by reading in an Excel or CSV File that consists of one column --> "LayerSize". In this column you have the hidden layer sizes you want to iterate through. It is important that you read the values in as numerical types, otherwise you will run into the same issue as before. If in doubt, use the import wizard.

Now, instead of setting the macro values directly, you can extract them from the read-in file, which contains your layer sizes, with the "Extract Macro" operator. By setting the macro type to "data value" you get a new parameter called "example index". This is basically just the row number in your Excel sheet and luckily you can iterate through it within the Optimize operator.

My Excel sheet had six values for hidden layers, hence I am iterating through from 1 to 6. You will need to adapt accordingly, depending on how many values for hidden layers you have. You need to use one "Extract Macro" operator for each hidden layer you would like to iterate through. If the values you would like to iterate through for different hidden layers are not identical, you would need to create more than one Excel/CSV file and read them in, as shown in the process above.
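
The lookup described above can be sketched in Python. The six layer sizes below are made-up placeholders (the thread does not list the actual values), and `extract_layer_size` is a hypothetical stand-in for what "Extract Macro" does with a 1-based example index.

```python
# Hypothetical contents of the "LayerSize" column, stored as integers.
# Reading them as numbers (not strings) is the point of the workaround.
layer_sizes = [10, 20, 50, 100, 200, 500]


def extract_layer_size(example_index):
    """Return the layer size stored in the given 1-based row."""
    if not 1 <= example_index <= len(layer_sizes):
        raise IndexError(f"example_index must be between 1 and {len(layer_sizes)}")
    return layer_sizes[example_index - 1]


# The optimizer varies one row index per hidden layer, here 1 to 6 for both.
candidates = [
    (extract_layer_size(i), extract_layer_size(j))
    for i in range(1, len(layer_sizes) + 1)
    for j in range(1, len(layer_sizes) + 1)
]

print(f"{len(candidates)} layer-size pairs, e.g. {candidates[0]} and {candidates[-1]}")
```

The optimizer never sees the layer sizes themselves, only the row numbers, which is why the values in the file can be spaced irregularly.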

cyborghijacker · August 2017

Hi FBT

Oh I see, I was trying to put eval() into the hidden layer size in the Deep Learning operator. So it was an issue with 'strings' being read by the Set Macro operator. I have implemented the workaround that you have suggested, and it works well, although sometimes the Deep Learning shows a Warning message about the macro not being defined.

On a separate note, do you know what the 'standardize' parameter in DL does? It says:

standardize (optional)

If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data.

Once again, thank you for your detailed clarifications on this topic.

Regards

FBT · August 2017

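I believe that this parameter rescales your attributes to have a mean of 0 and a standard deviation of 1. This kind of scaling is a linear shift and stretch, so it does not change the shape of the distribution; it is most natural for roughly Gaussian data but does not strictly require it.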
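
For illustration, a minimal Python sketch of that kind of rescaling (z-scores) is shown below. Whether the operator uses the population or the sample standard deviation is not confirmed in this thread; the sketch assumes the population version, and `standardize` is a made-up helper name.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores).

    Assumption: the population standard deviation is used.
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(readings)

print(scaled)
print(statistics.mean(scaled), statistics.pstdev(scaled))
```

Each attribute would be treated this way independently, so attributes with very different units end up on a comparable scale.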

cyborghijacker · August 2017

I see, so it's essentially an in-built Normalize operator with the Z-scaling method. I'm assuming the user can choose between normalizing manually and simply checking this parameter?

CkNN_Algo · May 2020

You can put "Read Excel" outside of the optimization block. Doing so prevents the file from being read over and over, which helps your drive and execution speed.

Altair RapidMiner Community
