Multiple Stacking

mustafa_mert_ce · October 2016

Hi,

I would like to set up multiple stackings. Here is the situation:

I have some base learners in my Stacking, and I use a neural network as the stacked learner. What I'd like to do is use several separate neural networks and have their outputs voted.

So Base Learners --> Different Neural Networks ---> Averaging ---> Output.

I tried adding a "Vote" operator on the Stacked Learner side of Stacking and putting the Neural Nets inside it, but I keep getting attribute name collisions. How can I realise this scheme?

In other words, I want multiple stacking: take the base learners, feed their outputs to different neural networks, then average those neural network outputs and use the average as the result.

MartinLiebig · October 2016

Hi,

Sure. I think there are two ways to do it: either combine Stacking and Vote, or use three Stackings (you may have to switch roles), join their results, and use Generate Aggregation to average them.

The voting one is attached.

~Martin

<?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.2.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.2.003" expanded="true" height="68" name="Retrieve Deals-Testset" width="90" x="112" y="34">
        <parameter key="repository_entry" value="//Samples/data/Deals-Testset"/>
      </operator>
      <operator activated="true" class="vote" compatibility="7.2.003" expanded="true" height="68" name="Vote" width="90" x="313" y="34">
        <process expanded="true">
          <operator activated="true" class="stacking" compatibility="7.2.003" expanded="true" height="68" name="Stacking" width="90" x="112" y="34">
            <process expanded="true">
              <operator activated="true" class="naive_bayes" compatibility="7.2.003" expanded="true" height="82" name="Naive Bayes" width="90" x="112" y="136"/>
              <operator activated="true" class="parallel_decision_tree" compatibility="7.2.003" expanded="true" height="82" name="Decision Tree" width="90" x="112" y="289"/>
              <connect from_port="training set 1" to_op="Naive Bayes" to_port="training set"/>
              <connect from_port="training set 2" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Naive Bayes" from_port="model" to_port="base model 1"/>
              <connect from_op="Decision Tree" from_port="model" to_port="base model 2"/>
              <portSpacing port="source_training set 1" spacing="0"/>
              <portSpacing port="source_training set 2" spacing="63"/>
              <portSpacing port="source_training set 3" spacing="0"/>
              <portSpacing port="sink_base model 1" spacing="0"/>
              <portSpacing port="sink_base model 2" spacing="0"/>
              <portSpacing port="sink_base model 3" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="h2o:deep_learning" compatibility="7.2.000" expanded="true" height="82" name="Deep Learning (2)" width="90" x="112" y="34">
                <enumeration key="hidden_layer_sizes">
                  <parameter key="hidden_layer_sizes" value="50"/>
                  <parameter key="hidden_layer_sizes" value="50"/>
                </enumeration>
                <enumeration key="hidden_dropout_ratios"/>
                <list key="expert_parameters"/>
                <description align="center" color="transparent" colored="false" width="126">rect</description>
              </operator>
              <connect from_port="stacking examples" to_op="Deep Learning (2)" to_port="training set"/>
              <connect from_op="Deep Learning (2)" from_port="model" to_port="stacking model"/>
              <portSpacing port="source_stacking examples" spacing="0"/>
              <portSpacing port="sink_stacking model" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="stacking" compatibility="7.2.003" expanded="true" height="68" name="Stacking (2)" width="90" x="112" y="136">
            <process expanded="true">
              <operator activated="true" class="naive_bayes" compatibility="7.2.003" expanded="true" height="82" name="Naive Bayes (2)" width="90" x="112" y="136"/>
              <operator activated="true" class="parallel_decision_tree" compatibility="7.2.003" expanded="true" height="82" name="Decision Tree (2)" width="90" x="112" y="289"/>
              <connect from_port="training set 1" to_op="Naive Bayes (2)" to_port="training set"/>
              <connect from_port="training set 2" to_op="Decision Tree (2)" to_port="training set"/>
              <connect from_op="Naive Bayes (2)" from_port="model" to_port="base model 1"/>
              <connect from_op="Decision Tree (2)" from_port="model" to_port="base model 2"/>
              <portSpacing port="source_training set 1" spacing="0"/>
              <portSpacing port="source_training set 2" spacing="0"/>
              <portSpacing port="source_training set 3" spacing="0"/>
              <portSpacing port="sink_base model 1" spacing="0"/>
              <portSpacing port="sink_base model 2" spacing="0"/>
              <portSpacing port="sink_base model 3" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="h2o:deep_learning" compatibility="7.2.000" expanded="true" height="82" name="Deep Learning (3)" width="90" x="112" y="34">
                <enumeration key="hidden_layer_sizes">
                  <parameter key="hidden_layer_sizes" value="50"/>
                  <parameter key="hidden_layer_sizes" value="50"/>
                </enumeration>
                <enumeration key="hidden_dropout_ratios"/>
                <list key="expert_parameters"/>
                <description align="center" color="transparent" colored="false" width="126">rect</description>
              </operator>
              <connect from_port="stacking examples" to_op="Deep Learning (3)" to_port="training set"/>
              <connect from_op="Deep Learning (3)" from_port="model" to_port="stacking model"/>
              <portSpacing port="source_stacking examples" spacing="0"/>
              <portSpacing port="sink_stacking model" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="stacking" compatibility="7.2.003" expanded="true" height="68" name="Stacking (3)" width="90" x="112" y="238">
            <process expanded="true">
              <operator activated="true" class="naive_bayes" compatibility="7.2.003" expanded="true" height="82" name="Naive Bayes (3)" width="90" x="112" y="136"/>
              <operator activated="true" class="parallel_decision_tree" compatibility="7.2.003" expanded="true" height="82" name="Decision Tree (3)" width="90" x="112" y="289"/>
              <connect from_port="training set 1" to_op="Naive Bayes (3)" to_port="training set"/>
              <connect from_port="training set 2" to_op="Decision Tree (3)" to_port="training set"/>
              <connect from_op="Naive Bayes (3)" from_port="model" to_port="base model 1"/>
              <connect from_op="Decision Tree (3)" from_port="model" to_port="base model 2"/>
              <portSpacing port="source_training set 1" spacing="0"/>
              <portSpacing port="source_training set 2" spacing="0"/>
              <portSpacing port="source_training set 3" spacing="0"/>
              <portSpacing port="sink_base model 1" spacing="0"/>
              <portSpacing port="sink_base model 2" spacing="0"/>
              <portSpacing port="sink_base model 3" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="h2o:deep_learning" compatibility="7.2.000" expanded="true" height="82" name="Deep Learning (4)" width="90" x="112" y="34">
                <enumeration key="hidden_layer_sizes">
                  <parameter key="hidden_layer_sizes" value="50"/>
                  <parameter key="hidden_layer_sizes" value="50"/>
                </enumeration>
                <enumeration key="hidden_dropout_ratios"/>
                <list key="expert_parameters"/>
                <description align="center" color="transparent" colored="false" width="126">rect</description>
              </operator>
              <connect from_port="stacking examples" to_op="Deep Learning (4)" to_port="training set"/>
              <connect from_op="Deep Learning (4)" from_port="model" to_port="stacking model"/>
              <portSpacing port="source_stacking examples" spacing="0"/>
              <portSpacing port="sink_stacking model" spacing="0"/>
            </process>
          </operator>
          <connect from_port="training set 1" to_op="Stacking" to_port="training set"/>
          <connect from_port="training set 2" to_op="Stacking (2)" to_port="training set"/>
          <connect from_port="training set 3" to_op="Stacking (3)" to_port="training set"/>
          <connect from_op="Stacking" from_port="model" to_port="base model 1"/>
          <connect from_op="Stacking (2)" from_port="model" to_port="base model 2"/>
          <connect from_op="Stacking (3)" from_port="model" to_port="base model 3"/>
          <portSpacing port="source_training set 1" spacing="0"/>
          <portSpacing port="source_training set 2" spacing="0"/>
          <portSpacing port="source_training set 3" spacing="0"/>
          <portSpacing port="source_training set 4" spacing="0"/>
          <portSpacing port="sink_base model 1" spacing="0"/>
          <portSpacing port="sink_base model 2" spacing="0"/>
          <portSpacing port="sink_base model 3" spacing="0"/>
          <portSpacing port="sink_base model 4" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Retrieve Deals-Testset" from_port="output" to_op="Vote" to_port="training set"/>
      <connect from_op="Vote" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
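
For readers who find the nesting easier to follow outside the process XML, here is a minimal Python sketch of the same shape: base learners feed several meta learners, and the meta learners' outputs are averaged. It is not RapidMiner code. All function names (`fit_constant`, `fit_slope`, `fit_meta_average`, `fit_meta_best`, `fit_stacking`, `fit_averaged_stackings`) and the toy data are hypothetical stand-ins, and the meta learners are trained on in-sample base predictions purely to keep the example short.

```python
from statistics import mean

# Hypothetical toy learners: each fit_* function returns a predict(row) function.

def fit_constant(X, y):
    """Base learner: always predicts the training mean."""
    m = mean(y)
    return lambda row: m

def fit_slope(X, y):
    """Base learner: least-squares line through the origin on the first feature."""
    w = sum(r[0] * t for r, t in zip(X, y)) / (sum(r[0] ** 2 for r in X) or 1.0)
    return lambda row: w * row[0]

def fit_meta_average(Z, y):
    """Meta learner 1: plain mean of the base predictions."""
    return lambda z: mean(z)

def fit_meta_best(Z, y):
    """Meta learner 2: keep the base prediction with the lowest squared error."""
    errs = [sum((z[j] - t) ** 2 for z, t in zip(Z, y)) for j in range(len(Z[0]))]
    best = errs.index(min(errs))
    return lambda z: z[best]

def fit_stacking(X, y, base_fits, meta_fit):
    """One stacking: base outputs become the features of the meta learner."""
    bases = [fit(X, y) for fit in base_fits]
    Z = [[b(row) for b in bases] for row in X]
    meta = meta_fit(Z, y)
    return lambda row: meta([b(row) for b in bases])

def fit_averaged_stackings(X, y, base_fits, meta_fits):
    """Several stackings (one per meta learner) whose outputs are averaged."""
    stacks = [fit_stacking(X, y, base_fits, m) for m in meta_fits]
    return lambda row: mean([s(row) for s in stacks])

# Toy regression data (made up for illustration): y = 2 * x
X = [[1.0], [2.0], [3.0], [4.0]]
y = [2.0, 4.0, 6.0, 8.0]

model = fit_averaged_stackings(
    X, y,
    base_fits=[fit_constant, fit_slope],
    meta_fits=[fit_meta_average, fit_meta_best],
)
print(model([5.0]))
```

With this toy data the two stacked models predict 7.5 and 10.0 for the new row, so the averaged output is 8.75.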

mustafa_mert_ce · October 2016

Thanks! Just a quick question: do I need to enclose this Vote in a Cross-Validation to make sure the cross-validation covers the entire ensemble (the whole Vote)?
I want to cross-validate only the outputs of the Vote, not the inner Stackings. I don't want to minimize the RMSE of the base learners in the Stackings; I only want to minimize the RMSE of the Vote, as suggested in BellKor's Pragmatic Chaos.

MartinLiebig · October 2016

X-Val estimates the performance of the method used to generate a model. Your method for generating a model includes Stacking and Voting, so everything needs to be inside the X-Val.
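
As a rough illustration of why the whole ensemble goes inside the validation, here is a small Python sketch (not RapidMiner code). `build_ensemble` is a hypothetical stand-in for the full Vote-over-Stackings procedure, and `cross_validate_rmse` is an assumed helper name. The only point is that the complete model-building step is rerun on each training fold and scored on rows it has not seen.

```python
from math import sqrt
from statistics import mean

def build_ensemble(X, y):
    """Hypothetical stand-in for the whole procedure (stackings plus vote)."""
    m = mean(y)
    w = sum(r[0] * t for r, t in zip(X, y)) / (sum(r[0] ** 2 for r in X) or 1.0)
    members = [lambda row: m, lambda row: w * row[0]]
    # The "vote" for a numeric label: average the members' predictions.
    return lambda row: mean([f(row) for f in members])

def cross_validate_rmse(X, y, build_model, k=3):
    """k-fold estimate of the RMSE of the model-building method."""
    fold_rmse = []
    for i in range(k):
        test_idx = set(range(i, len(X), k))  # every k-th row is held out
        train_X = [r for j, r in enumerate(X) if j not in test_idx]
        train_y = [t for j, t in enumerate(y) if j not in test_idx]
        # The full procedure is refit on the training part of every fold.
        model = build_model(train_X, train_y)
        sq_errors = [(model(X[j]) - y[j]) ** 2 for j in sorted(test_idx)]
        fold_rmse.append(sqrt(mean(sq_errors)))
    return mean(fold_rmse)

# Toy data, made up for illustration: y = 2 * x
X = [[float(i)] for i in range(1, 7)]
y = [2.0 * i for i in range(1, 7)]

score = cross_validate_rmse(X, y, build_ensemble, k=3)
print(round(score, 3))
```

The reported RMSE belongs to the final averaged output only; the inner learners are never scored separately, they are simply rebuilt as part of each fold.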

~Martin
