RapidMiner

Using a SVM Within a Stacked Model...

SOLVED
RM Certified Analyst

Re: Using a SVM Within a Stacked Model...

Hello - and thanks for your suggestion.

I tried using the operator you suggested, but I still get the same error message I received using the standard SVM Operator:

"The operator expects the inner process to deliver a performance value".

If you have any other suggestions, my thanks, and I'd be happy to try them.
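For context, here is a hedged reading of this message, based on the process posted later in this thread. "Optimize Parameters" takes its result from whatever the inner process connects to its "performance" port. The two connections below are copied from that process. If the validation inside fails (for example because a learner rejects the data) and "error_handling" is set to "ignore error", nothing may arrive at that port, which would be consistent with the message above. That interpretation is an assumption and is not confirmed anywhere in the thread.

```xml
<!-- Inside "Optimize Parameters": data goes in, a performance vector has to come out -->
<connect from_port="input 1" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
```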

Best wishes, Michael

RM Certified Analyst

Re: Using a SVM Within a Stacked Model...

 

Hi, and thanks for your suggestion, which I tried. I still get the same error message.

I did some further experimenting and got some interesting results using both the standard SVM Operator and the Operator you suggested I try.

I'll talk you through the attached screenshots:

1. "Revised Model Overview.png" - I took the "Optimize Parameters - Grid" Operator out of the process, and moved the "Cross Validation" Operator to the top level of the process, following the "Generate weights...." Operator. I tried this because the error logging (attached in my initial post on this topic) suggests that the problem may have had something to do with the "Optimize Parameters - Grid" Operator.

2. "Revised Model Detail Nr 1.png" - a view of what is inside the "Cross Validation" Operator.

3. "Revised Model Detail Nr 2.png" - a view of what is inside the "Stacking" Operator (see the above screenshot).

4. "Revised Model Detail Nr 3_Error_Message.png" - this is the error message I now get when running the process. It is different from the error message I received when the "Optimize Parameters" Operator was part of the process (see my initial post).

5. "Revised Model Detail Nr 4_Data_View.png" - this is what the data looks like prior to entering the "Stacking" Operator (i.e. right after the PCA Operator shown in "Revised Model Detail Nr 1.png"). "creditworthy" is the Label field, and it is binominal.

6. "Revised Model Detail Nr 4_Statistics_View.png" - we see that all values in the data are numeric, with the exception of "creditworthy", which is binominal.

7. "SVM Operator Information.png" - from the RM Documentation, which states that the SVM Operator can handle numeric attributes and a binominal Label. It would therefore seem that the SVM Operator should be able to handle the data.

8. "Customer_Credit_Risk_Solution_Nr_2_MM.rmp" - the revised process file. The data the process uses was included in my initial post on this topic.

My investigations seem to indicate that using an SVM Operator within a "Stacking" Operator can be problematic, but perhaps I am missing a further step. Thanks for considering this, and I would be happy to try other suggestions.

 

Best wishes, Michael


RM Staff

Re: Using a SVM Within a Stacked Model...

Ha, I think I got it.


Try this:

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="false" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="112" y="238">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="read_excel" compatibility="7.5.001" expanded="true" height="68" name="Read Excel" width="90" x="112" y="85">
        <parameter key="excel_file" value="C:\Users\Martin\Downloads\Customer_Credit_Risk_Data.xlsx"/>
        <parameter key="imported_cell_range" value="A1:V989"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="foreignworker.true.polynominal.attribute"/>
          <parameter key="1" value="status.true.polynominal.attribute"/>
          <parameter key="2" value="credithistory.true.polynominal.attribute"/>
          <parameter key="3" value="purpose.true.polynominal.attribute"/>
          <parameter key="4" value="savings.true.polynominal.attribute"/>
          <parameter key="5" value="employmentsince.true.polynominal.attribute"/>
          <parameter key="6" value="otherdebtors.true.polynominal.attribute"/>
          <parameter key="7" value="property.true.polynominal.attribute"/>
          <parameter key="8" value="otherinstallments.true.polynominal.attribute"/>
          <parameter key="9" value="housing.true.polynominal.attribute"/>
          <parameter key="10" value="job.true.polynominal.attribute"/>
          <parameter key="11" value="phone.true.polynominal.attribute"/>
          <parameter key="12" value="duration.true.integer.attribute"/>
          <parameter key="13" value="creditamount.true.integer.attribute"/>
          <parameter key="14" value="installmentrate.true.integer.attribute"/>
          <parameter key="15" value="residencesince.true.integer.attribute"/>
          <parameter key="16" value="age.true.integer.attribute"/>
          <parameter key="17" value="numberofexsistingcredits.true.integer.attribute"/>
          <parameter key="18" value="numberofliablepeople.true.integer.attribute"/>
          <parameter key="19" value="gender.true.polynominal.attribute"/>
          <parameter key="20" value="creditworthy.true.polynominal.attribute"/>
          <parameter key="21" value="creditamout_per_month.true.numeric.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.5.001" expanded="true" height="82" name="Set Role" width="90" x="294" y="39">
        <parameter key="attribute_name" value="creditworthy"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="nominal_to_binominal" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="476" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="creditworthy"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nom to Numeric" width="90" x="663" y="34">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="credithistory|employmentsince|foreignworker|gender|housing|job|otherdebtors|otherinstallments|phone|property|purpose|savings|status"/>
        <list key="comparison_groups"/>
      </operator>
      <operator activated="true" class="extract_macro" compatibility="7.5.001" expanded="true" height="68" name="Extract Macro" width="90" x="835" y="48">
        <parameter key="macro" value="num_Examples"/>
        <list key="additional_macros"/>
      </operator>
      <operator activated="true" class="generate_weight_stratification" compatibility="7.5.001" expanded="true" height="82" name="Generate Weight (Stratification)" width="90" x="985" y="45">
        <parameter key="total_weight" value="%{num_Examples}"/>
      </operator>
      <operator activated="true" class="optimize_parameters_grid" compatibility="7.5.001" expanded="true" height="145" name="Optimize Parameters" width="90" x="1217" y="136">
        <list key="parameters">
          <parameter key="Stacking KNN.k" value="[5;20;3;linear]"/>
        </list>
        <parameter key="error_handling" value="ignore error"/>
        <process expanded="true">
          <operator activated="false" class="concurrency:cross_validation" compatibility="7.5.001" expanded="true" height="145" name="Cross Validation" width="90" x="981" y="348">
            <process expanded="true">
              <operator activated="true" class="normalize" compatibility="7.5.001" expanded="true" height="103" name="z Score Normalize" width="90" x="79" y="185">
                <parameter key="attribute_filter_type" value="subset"/>
                <parameter key="attributes" value="age|creditamount|creditamout_per_month|duration"/>
              </operator>
              <operator activated="true" class="principal_component_analysis" compatibility="7.5.001" expanded="true" height="103" name="PCA 32 Attributes" width="90" x="296" y="43">
                <parameter key="dimensionality_reduction" value="fixed number"/>
                <parameter key="variance_threshold" value="0.9"/>
                <parameter key="number_of_components" value="32"/>
              </operator>
              <operator activated="true" class="stacking" compatibility="7.5.001" expanded="true" height="68" name="Stacking" width="90" x="407" y="280">
                <process expanded="true">
                  <operator activated="true" class="naive_bayes" compatibility="7.5.001" expanded="true" height="82" name="Stacking NB" width="90" x="419" y="172"/>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost DT" width="90" x="419" y="35">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="82" name="Stacking DT" width="90" x="707" y="75">
                        <parameter key="maximal_depth" value="15"/>
                        <parameter key="minimal_gain" value="0.02"/>
                        <parameter key="minimal_leaf_size" value="6"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking DT" to_port="training set"/>
                      <connect from_op="Stacking DT" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost KNN" width="90" x="406" y="325">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="k_nn" compatibility="7.5.001" expanded="true" height="82" name="Stacking KNN" width="90" x="564" y="182">
                        <parameter key="k" value="5"/>
                        <parameter key="weighted_vote" value="true"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking KNN" to_port="training set"/>
                      <connect from_op="Stacking KNN" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost Log Regr" width="90" x="397" y="472">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="h2o:logistic_regression" compatibility="7.5.000" expanded="true" height="103" name="Stacking Log Regr" width="90" x="544" y="139">
                        <parameter key="solver" value="L_BFGS"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking Log Regr" to_port="training set"/>
                      <connect from_op="Stacking Log Regr" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <connect from_port="training set 1" to_op="Stacking NB" to_port="training set"/>
                  <connect from_port="training set 2" to_op="Ada Boost DT" to_port="training set"/>
                  <connect from_port="training set 3" to_op="Ada Boost KNN" to_port="training set"/>
                  <connect from_port="training set 4" to_op="Ada Boost Log Regr" to_port="training set"/>
                  <connect from_op="Stacking NB" from_port="model" to_port="base model 1"/>
                  <connect from_op="Ada Boost DT" from_port="model" to_port="base model 2"/>
                  <connect from_op="Ada Boost KNN" from_port="model" to_port="base model 3"/>
                  <connect from_op="Ada Boost Log Regr" from_port="model" to_port="base model 4"/>
                  <portSpacing port="source_training set 1" spacing="0"/>
                  <portSpacing port="source_training set 2" spacing="0"/>
                  <portSpacing port="source_training set 3" spacing="0"/>
                  <portSpacing port="source_training set 4" spacing="0"/>
                  <portSpacing port="source_training set 5" spacing="0"/>
                  <portSpacing port="sink_base model 1" spacing="0"/>
                  <portSpacing port="sink_base model 2" spacing="0"/>
                  <portSpacing port="sink_base model 3" spacing="0"/>
                  <portSpacing port="sink_base model 4" spacing="0"/>
                  <portSpacing port="sink_base model 5" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="false" class="performance_to_data" compatibility="7.5.001" expanded="true" height="82" name="Performance to Data" width="90" x="428" y="164"/>
                  <operator activated="false" class="remember" compatibility="7.5.001" expanded="true" height="68" name="Remember SVM Perf Vector" width="90" x="602" y="253">
                    <parameter key="name" value="SVM Perf Vector"/>
                    <parameter key="io_object" value="PerformanceVector"/>
                  </operator>
                  <operator activated="true" class="support_vector_machine" compatibility="7.5.001" expanded="true" height="124" name="SVM" width="90" x="221" y="50">
                    <parameter key="C" value="5.0"/>
                  </operator>
                  <connect from_port="stacking examples" to_op="SVM" to_port="training set"/>
                  <connect from_op="Performance to Data" from_port="performance vector" to_op="Remember SVM Perf Vector" to_port="store"/>
                  <connect from_op="SVM" from_port="model" to_port="stacking model"/>
                  <portSpacing port="source_stacking examples" spacing="0"/>
                  <portSpacing port="sink_stacking model" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="124" name="Group Models" width="90" x="653" y="123"/>
              <connect from_port="training set" to_op="z Score Normalize" to_port="example set input"/>
              <connect from_op="z Score Normalize" from_port="example set output" to_op="PCA 32 Attributes" to_port="example set input"/>
              <connect from_op="z Score Normalize" from_port="preprocessing model" to_op="Group Models" to_port="models in 1"/>
              <connect from_op="PCA 32 Attributes" from_port="example set output" to_op="Stacking" to_port="training set"/>
              <connect from_op="PCA 32 Attributes" from_port="preprocessing model" to_op="Group Models" to_port="models in 2"/>
              <connect from_op="Stacking" from_port="model" to_op="Group Models" to_port="models in 3"/>
              <connect from_op="Group Models" from_port="model out" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="180"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.5.001" expanded="true" height="82" name="Apply Stack Model" width="90" x="108" y="91">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_binominal_classification" compatibility="7.5.001" expanded="true" height="82" name="Stacking Perf." width="90" x="546" y="34">
                <parameter key="AUC" value="true"/>
                <parameter key="precision" value="true"/>
                <parameter key="recall" value="true"/>
                <parameter key="false_positive" value="true"/>
                <parameter key="false_negative" value="true"/>
                <parameter key="true_positive" value="true"/>
                <parameter key="true_negative" value="true"/>
                <parameter key="sensitivity" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Stack Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Stack Model" to_port="unlabelled data"/>
              <connect from_op="Apply Stack Model" from_port="labelled data" to_op="Stacking Perf." to_port="labelled data"/>
              <connect from_op="Stacking Perf." from_port="performance" to_port="performance 1"/>
              <connect from_op="Stacking Perf." from_port="example set" to_port="test set results"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="181"/>
              <portSpacing port="sink_performance 2" spacing="103"/>
              <description align="center" color="green" colored="true" height="57" resized="true" width="159" x="73" y="200">To Test Fold</description>
            </process>
          </operator>
          <operator activated="true" class="split_validation" compatibility="7.5.001" expanded="true" height="124" name="Validation" width="90" x="112" y="34">
            <parameter key="sampling_type" value="stratified sampling"/>
            <process expanded="true">
              <operator activated="true" class="normalize" compatibility="7.5.001" expanded="true" height="103" name="z Score Normalize (2)" width="90" x="147" y="158">
                <parameter key="attribute_filter_type" value="subset"/>
                <parameter key="attributes" value="age|creditamount|creditamout_per_month|duration"/>
              </operator>
              <operator activated="true" class="principal_component_analysis" compatibility="7.5.001" expanded="true" height="103" name="PCA 32 Attributes (2)" width="90" x="323" y="34">
                <parameter key="dimensionality_reduction" value="fixed number"/>
                <parameter key="variance_threshold" value="0.9"/>
                <parameter key="number_of_components" value="32"/>
              </operator>
              <operator activated="true" class="stacking" compatibility="7.5.001" expanded="true" height="68" name="Stacked Models with SVM" width="90" x="514" y="442">
                <process expanded="true">
                  <operator activated="true" class="naive_bayes" compatibility="7.5.001" expanded="true" height="82" name="Stacking NB (2)" width="90" x="414" y="170"/>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost DT (2)" width="90" x="418" y="35">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="82" name="Stacking DT (2)" width="90" x="707" y="75">
                        <parameter key="maximal_depth" value="15"/>
                        <parameter key="minimal_gain" value="0.02"/>
                        <parameter key="minimal_leaf_size" value="6"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking DT (2)" to_port="training set"/>
                      <connect from_op="Stacking DT (2)" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost KNN (2)" width="90" x="415" y="330">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="k_nn" compatibility="7.5.001" expanded="true" height="82" name="Stacking KNN (2)" width="90" x="564" y="182">
                        <parameter key="k" value="20"/>
                        <parameter key="weighted_vote" value="true"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking KNN (2)" to_port="training set"/>
                      <connect from_op="Stacking KNN (2)" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost Log Regr (2)" width="90" x="407" y="488">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="h2o:logistic_regression" compatibility="7.5.000" expanded="true" height="103" name="Stacking Log Regr (2)" width="90" x="544" y="139">
                        <parameter key="solver" value="L_BFGS"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking Log Regr (2)" to_port="training set"/>
                      <connect from_op="Stacking Log Regr (2)" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <connect from_port="training set 1" to_op="Stacking NB (2)" to_port="training set"/>
                  <connect from_port="training set 2" to_op="Ada Boost DT (2)" to_port="training set"/>
                  <connect from_port="training set 3" to_op="Ada Boost KNN (2)" to_port="training set"/>
                  <connect from_port="training set 4" to_op="Ada Boost Log Regr (2)" to_port="training set"/>
                  <connect from_op="Stacking NB (2)" from_port="model" to_port="base model 1"/>
                  <connect from_op="Ada Boost DT (2)" from_port="model" to_port="base model 2"/>
                  <connect from_op="Ada Boost KNN (2)" from_port="model" to_port="base model 3"/>
                  <connect from_op="Ada Boost Log Regr (2)" from_port="model" to_port="base model 4"/>
                  <portSpacing port="source_training set 1" spacing="0"/>
                  <portSpacing port="source_training set 2" spacing="0"/>
                  <portSpacing port="source_training set 3" spacing="0"/>
                  <portSpacing port="source_training set 4" spacing="0"/>
                  <portSpacing port="source_training set 5" spacing="0"/>
                  <portSpacing port="sink_base model 1" spacing="0"/>
                  <portSpacing port="sink_base model 2" spacing="0"/>
                  <portSpacing port="sink_base model 3" spacing="0"/>
                  <portSpacing port="sink_base model 4" spacing="0"/>
                  <portSpacing port="sink_base model 5" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Numerical" width="90" x="45" y="34">
                    <list key="comparison_groups"/>
                  </operator>
                  <operator activated="true" class="support_vector_machine" compatibility="7.5.001" expanded="true" height="124" name="SVM (2)" width="90" x="179" y="34">
                    <parameter key="C" value="5.0"/>
                  </operator>
                  <operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="103" name="Group Models (3)" width="90" x="313" y="136"/>
                  <connect from_port="stacking examples" to_op="Nominal to Numerical" to_port="example set input"/>
                  <connect from_op="Nominal to Numerical" from_port="example set output" to_op="SVM (2)" to_port="training set"/>
                  <connect from_op="Nominal to Numerical" from_port="preprocessing model" to_op="Group Models (3)" to_port="models in 1"/>
                  <connect from_op="SVM (2)" from_port="model" to_op="Group Models (3)" to_port="models in 2"/>
                  <connect from_op="Group Models (3)" from_port="model out" to_port="stacking model"/>
                  <portSpacing port="source_stacking examples" spacing="0"/>
                  <portSpacing port="sink_stacking model" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="124" name="Group Models (2)" width="90" x="751" y="196"/>
              <connect from_port="training" to_op="z Score Normalize (2)" to_port="example set input"/>
              <connect from_op="z Score Normalize (2)" from_port="example set output" to_op="PCA 32 Attributes (2)" to_port="example set input"/>
              <connect from_op="z Score Normalize (2)" from_port="preprocessing model" to_op="Group Models (2)" to_port="models in 1"/>
              <connect from_op="PCA 32 Attributes (2)" from_port="example set output" to_op="Stacked Models with SVM" to_port="training set"/>
              <connect from_op="PCA 32 Attributes (2)" from_port="preprocessing model" to_op="Group Models (2)" to_port="models in 2"/>
              <connect from_op="Stacked Models with SVM" from_port="model" to_op="Group Models (2)" to_port="models in 3"/>
              <connect from_op="Group Models (2)" from_port="model out" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.5.001" expanded="true" height="82" name="Apply Stack Model (2)" width="90" x="113" y="70">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_binominal_classification" compatibility="7.5.001" expanded="true" height="82" name="Stacking Perf. (2)" width="90" x="452" y="73">
                <parameter key="AUC" value="true"/>
                <parameter key="precision" value="true"/>
                <parameter key="recall" value="true"/>
                <parameter key="false_positive" value="true"/>
                <parameter key="false_negative" value="true"/>
                <parameter key="true_positive" value="true"/>
                <parameter key="true_negative" value="true"/>
                <parameter key="sensitivity" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Stack Model (2)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Stack Model (2)" to_port="unlabelled data"/>
              <connect from_op="Apply Stack Model (2)" from_port="labelled data" to_op="Stacking Perf. (2)" to_port="labelled data"/>
              <connect from_op="Stacking Perf. (2)" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="false" class="log" compatibility="7.5.001" expanded="true" height="82" name="Log Stacked Models" width="90" x="1174" y="475">
            <list key="log">
              <parameter key="Model Accuracy" value="operator.Cross Validation.value.performance 1"/>
              <parameter key="Model AUC" value="operator.Cross Validation.value.performance 2"/>
              <parameter key="Model Perf. 3" value="operator.Cross Validation.value.performance 3"/>
              <parameter key="Iteration Time" value="operator.Cross Validation.value.time"/>
              <parameter key="DL Activation" value="operator.Deep Learning.parameter.activation"/>
            </list>
          </operator>
          <operator activated="false" class="log_to_data" compatibility="7.5.001" expanded="true" height="103" name="Log to Data" width="90" x="1400" y="511">
            <parameter key="log_name" value="Log Stacked Models"/>
          </operator>
          <operator activated="false" class="write_excel" compatibility="7.5.001" expanded="true" height="82" name="Write Excel" width="90" x="1583" y="606">
            <parameter key="excel_file" value="C:\Data\Rapid_Miner_Training\Lab_Assignments\MM_Labl_Stacking_Process_Log.xlsx"/>
          </operator>
          <connect from_port="input 1" to_op="Validation" to_port="training"/>
          <connect from_op="Cross Validation" from_port="performance 1" to_op="Log Stacked Models" to_port="through 1"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="training" to_port="result 2"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
          <connect from_op="Log Stacked Models" from_port="through 1" to_op="Log to Data" to_port="through 1"/>
          <connect from_op="Log to Data" from_port="exampleSet" to_op="Write Excel" to_port="input"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <description align="center" color="green" colored="true" height="156" resized="false" width="180" x="940" y="529">Used Cross Validation first&lt;br&gt;Process runs to completion as long as a SVM is not one of the Models&lt;br/&gt;As RM documentation use case uses the split Validation Operator, I tried that as well.</description>
        </process>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
      <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Nom to Numeric" to_port="example set input"/>
      <connect from_op="Nom to Numeric" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
      <connect from_op="Extract Macro" from_port="example set" to_op="Generate Weight (Stratification)" to_port="example set input"/>
      <connect from_op="Generate Weight (Stratification)" from_port="example set output" to_op="Optimize Parameters" to_port="input 1"/>
      <connect from_op="Optimize Parameters" from_port="performance" to_port="result 1"/>
      <connect from_op="Optimize Parameters" from_port="parameter" to_port="result 2"/>
      <connect from_op="Optimize Parameters" from_port="result 1" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="19"/>
    </process>
  </operator>
</process>

~Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Certified Analyst

Re: Using a SVM Within a Stacked Model...

Thanks, Martin.  The Process does work despite many Warnings (exclamation marks).  ;-)

I am able to write the Logging to Excel as long as I use the "Recall" Operator as the last step in the Process (see "Process Flow" screenshot).  Writing to Excel right after "Log to Data" causes the Process to fail.

Interesting that you need to use a "Split Validation" if the Stacked Model includes an SVM.  This has implications regarding data selection for training.

Also interesting is that another conversion is required before the SVM sees the data on the "Learner" side of the "Stacking" Operator. Can you explain why this is needed, as I may need to try this in other situations?

I don't seem to be able to output the Predictions the model generates to a "res" port.  I am only able to see the training cases before the Model generates Predictions.  Is there anything you would suggest I try?

Best wishes and kind regards, Michael

 

 

RM Staff

Re: Using a SVM Within a Stacked Model...

Dear Michael,

 

To explain the issue a bit: Stacking adds the predictions of the base learners to the attribute set of the stacked learner. Thus you had 4 additional nominal attributes - baseprediction0, baseprediction1, ... An SVM cannot handle nominal attributes and fails.

The bug is that the stack trace is wrong.
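
To illustrate the point about nominal base predictions, here is a minimal sketch in plain Python (not RapidMiner code). It assumes dummy coding, where every nominal value becomes its own 0/1 column, as the kind of conversion a "Nominal to Numerical" step performs before the SVM. The function name `dummy_code`, the column `pc_1`, and the sample rows are hypothetical and only serve the illustration.

```python
# Hypothetical illustration: dummy_code and the sample rows are assumptions,
# not part of any RapidMiner API.

def dummy_code(rows, nominal_columns):
    """Replace each nominal column by 0/1 indicator columns, one per value."""
    # Collect the sorted set of values seen in each nominal column.
    values = {col: sorted({row[col] for row in rows}) for col in nominal_columns}
    encoded = []
    for row in rows:
        # Keep all non-nominal (already numeric) attributes unchanged.
        new_row = {k: v for k, v in row.items() if k not in nominal_columns}
        for col in nominal_columns:
            for value in values[col]:
                new_row[f"{col} = {value}"] = 1.0 if row[col] == value else 0.0
        encoded.append(new_row)
    return encoded


# Stacking examples as the stacked learner sees them: a numeric attribute
# plus the nominal predictions of two base learners.
stacking_examples = [
    {"pc_1": 0.42, "baseprediction0": "true", "baseprediction1": "false"},
    {"pc_1": -1.3, "baseprediction0": "false", "baseprediction1": "false"},
]
numeric_examples = dummy_code(
    stacking_examples, ["baseprediction0", "baseprediction1"]
)
```

After this step every attribute is numeric, which is the form a learner that only accepts numerical attributes can train on.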

 

For the other questions we need to take a closer look. It's Friday evening for me, so I most likely won't get to it before Monday morning.

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Certified Analyst

Re: Using a SVM Within a Stacked Model...

Many thanks, Martin, for the explanation - it makes total sense.

Have a nice weekend! ;-)

Kind regards,

Michael

 

RM Certified Analyst

Re: Using a SVM Within a Stacked Model...

Hi Martin:

I took the "Optimize Parameters - Grid" Operator out of the Process you sent me (see below) and replaced the "Split Validation" operator from your process with the "Cross Validation" Operator - and now all works as expected.

So it seems that the problem I originally ran into has to do with using the "Optimize Parameters - Grid" operator when an "SVM" Operator is on the training or learning side of a "Stacking" Operator within a "Cross Validation" Operator.

As there are a number of parameters worth monitoring and logging in the "Stacked Model", it would be interesting if it were somehow possible to use "Optimize Parameters" and a "Cross Validation" Operator (to ensure all data is used for training and testing).  

Your suggested Process makes it possible to use "Optimize Parameters" if you use a "Split Validation" Operator - my example (below) allows for the "Cross Validation" Operator, but not "Optimize Parameters".  
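
The combination discussed here, a parameter grid wrapped around a k-fold cross validation, can be sketched in plain Python (this is not RapidMiner code). It is only a sketch under stated assumptions: `log_grid` assumes that a grid setting such as `[0.1;10;2;logarithmic]` means steps + 1 logarithmically spaced values, and `evaluate` is a hypothetical callback standing in for "train the stacked model on the training fold and score it on the test fold".

```python
import math


def log_grid(low, high, steps):
    """Return steps + 1 logarithmically spaced values from low to high.

    Assumed reading of a grid such as [0.1;10;2;logarithmic].
    """
    start = math.log10(low)
    width = (math.log10(high) - start) / steps
    return [10 ** (start + i * width) for i in range(steps + 1)]


def k_fold_indices(n_examples, k):
    """Yield (train, test) index lists; every example is tested exactly once."""
    folds = [list(range(i, n_examples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def grid_search_cv(n_examples, k, grid, evaluate):
    """Return (best_value, best_mean_score) over all grid values.

    evaluate(value, train, test) is a hypothetical callback that returns
    a score for one fold; higher is better.
    """
    best = None
    for value in grid:
        scores = [
            evaluate(value, train, test)
            for train, test in k_fold_indices(n_examples, k)
        ]
        mean_score = sum(scores) / len(scores)
        if best is None or mean_score > best[1]:
            best = (value, mean_score)
    return best


# Placeholder scoring function (an assumption for the demo): it prefers
# values of C close to 1 and ignores the fold contents.
def toy_evaluate(value, train, test):
    return -abs(math.log10(value))


c_grid = log_grid(0.1, 10, 2)
best_c, best_score = grid_search_cv(20, 5, c_grid, toy_evaluate)
```

The outer loop plays the role of "Optimize Parameters", the inner loop the role of "Cross Validation": each candidate value is scored on all folds, so all data is used for both training and testing.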

Best wishes, Michael

 

<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="7.5.001" expanded="true" height="68" name="Read Excel" width="90" x="130" y="34">
<parameter key="excel_file" value="C:\Data\Rapid_Miner_Training\Lab_Assignments\Customer_Credit_Risk_Data.xlsx"/>
<parameter key="imported_cell_range" value="A1:V989"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="foreignworker.true.polynominal.attribute"/>
<parameter key="1" value="status.true.polynominal.attribute"/>
<parameter key="2" value="credithistory.true.polynominal.attribute"/>
<parameter key="3" value="purpose.true.polynominal.attribute"/>
<parameter key="4" value="savings.true.polynominal.attribute"/>
<parameter key="5" value="employmentsince.true.polynominal.attribute"/>
<parameter key="6" value="otherdebtors.true.polynominal.attribute"/>
<parameter key="7" value="property.true.polynominal.attribute"/>
<parameter key="8" value="otherinstallments.true.polynominal.attribute"/>
<parameter key="9" value="housing.true.polynominal.attribute"/>
<parameter key="10" value="job.true.polynominal.attribute"/>
<parameter key="11" value="phone.true.polynominal.attribute"/>
<parameter key="12" value="duration.true.integer.attribute"/>
<parameter key="13" value="creditamount.true.integer.attribute"/>
<parameter key="14" value="installmentrate.true.integer.attribute"/>
<parameter key="15" value="residencesince.true.integer.attribute"/>
<parameter key="16" value="age.true.integer.attribute"/>
<parameter key="17" value="numberofexsistingcredits.true.integer.attribute"/>
<parameter key="18" value="numberofliablepeople.true.integer.attribute"/>
<parameter key="19" value="gender.true.polynominal.attribute"/>
<parameter key="20" value="creditworthy.true.polynominal.attribute"/>
<parameter key="21" value="creditamout_per_month.true.numeric.attribute"/>
</list>
</operator>
<operator activated="true" class="set_role" compatibility="7.5.001" expanded="true" height="82" name="Set Role" width="90" x="286" y="34">
<parameter key="attribute_name" value="creditworthy"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="nominal_to_binominal" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="483" y="36">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="creditworthy"/>
<parameter key="include_special_attributes" value="true"/>
</operator>
<operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nom to Numeric" width="90" x="663" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="credithistory|employmentsince|foreignworker|gender|housing|job|otherdebtors|otherinstallments|phone|property|purpose|savings|status"/>
<list key="comparison_groups"/>
</operator>
<operator activated="true" class="extract_macro" compatibility="7.5.001" expanded="true" height="68" name="Extract Macro" width="90" x="835" y="48">
<parameter key="macro" value="num_Examples"/>
<list key="additional_macros"/>
</operator>
<operator activated="true" class="generate_weight_stratification" compatibility="7.5.001" expanded="true" height="82" name="Generate Weight (Stratification)" width="90" x="1012" y="42">
<parameter key="total_weight" value="%{num_Examples}"/>
</operator>
<operator activated="true" class="concurrency:cross_validation" compatibility="7.5.001" expanded="true" height="145" name="Cross Validation" width="90" x="1196" y="42">
<process expanded="true">
<operator activated="true" class="normalize" compatibility="7.5.001" expanded="true" height="103" name="z Score Normalize" width="90" x="121" y="186">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="age|creditamount|creditamout_per_month|duration"/>
</operator>
<operator activated="true" class="principal_component_analysis" compatibility="7.5.001" expanded="true" height="103" name="PCA 32 Attributes" width="90" x="261" y="34">
<parameter key="dimensionality_reduction" value="fixed number"/>
<parameter key="variance_threshold" value="0.9"/>
<parameter key="number_of_components" value="32"/>
</operator>
<operator activated="true" class="stacking" compatibility="7.5.001" expanded="true" height="68" name="Stacked Models with SVM" width="90" x="426" y="347">
<process expanded="true">
<operator activated="true" class="naive_bayes" compatibility="7.5.001" expanded="true" height="82" name="Stacking NB" width="90" x="382" y="177"/>
<operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost DT" width="90" x="379" y="36">
<parameter key="iterations" value="12"/>
<process expanded="true">
<operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="82" name="Stacking DT" width="90" x="707" y="75">
<parameter key="maximal_depth" value="15"/>
<parameter key="minimal_gain" value="0.02"/>
<parameter key="minimal_leaf_size" value="6"/>
</operator>
<connect from_port="training set" to_op="Stacking DT" to_port="training set"/>
<connect from_op="Stacking DT" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost KNN" width="90" x="375" y="307">
<parameter key="iterations" value="12"/>
<process expanded="true">
<operator activated="true" class="k_nn" compatibility="7.5.001" expanded="true" height="82" name="Stacking KNN" width="90" x="564" y="182">
<parameter key="k" value="10"/>
<parameter key="weighted_vote" value="true"/>
</operator>
<connect from_port="training set" to_op="Stacking KNN" to_port="training set"/>
<connect from_op="Stacking KNN" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost Log Regr" width="90" x="366" y="457">
<parameter key="iterations" value="12"/>
<process expanded="true">
<operator activated="true" class="h2o:logistic_regression" compatibility="7.5.000" expanded="true" height="103" name="Stacking Log Regr" width="90" x="543" y="139"/>
<connect from_port="training set" to_op="Stacking Log Regr" to_port="training set"/>
<connect from_op="Stacking Log Regr" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
</process>
</operator>
<connect from_port="training set 1" to_op="Stacking NB" to_port="training set"/>
<connect from_port="training set 2" to_op="Ada Boost DT" to_port="training set"/>
<connect from_port="training set 3" to_op="Ada Boost KNN" to_port="training set"/>
<connect from_port="training set 4" to_op="Ada Boost Log Regr" to_port="training set"/>
<connect from_op="Stacking NB" from_port="model" to_port="base model 2"/>
<connect from_op="Ada Boost DT" from_port="model" to_port="base model 1"/>
<connect from_op="Ada Boost KNN" from_port="model" to_port="base model 3"/>
<connect from_op="Ada Boost Log Regr" from_port="model" to_port="base model 4"/>
<portSpacing port="source_training set 1" spacing="0"/>
<portSpacing port="source_training set 2" spacing="0"/>
<portSpacing port="source_training set 3" spacing="0"/>
<portSpacing port="source_training set 4" spacing="0"/>
<portSpacing port="source_training set 5" spacing="0"/>
<portSpacing port="sink_base model 1" spacing="0"/>
<portSpacing port="sink_base model 2" spacing="0"/>
<portSpacing port="sink_base model 3" spacing="0"/>
<portSpacing port="sink_base model 4" spacing="0"/>
<portSpacing port="sink_base model 5" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Numerical" width="90" x="88" y="210">
<list key="comparison_groups"/>
</operator>
<operator activated="true" class="support_vector_machine" compatibility="7.5.001" expanded="true" height="124" name="SVM" width="90" x="288" y="66">
<parameter key="kernel_type" value="polynomial"/>
<parameter key="C" value="5.0"/>
</operator>
<operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="103" name="Group Models Learning Side" width="90" x="524" y="239"/>
<connect from_port="stacking examples" to_op="Nominal to Numerical" to_port="example set input"/>
<connect from_op="Nominal to Numerical" from_port="example set output" to_op="SVM" to_port="training set"/>
<connect from_op="Nominal to Numerical" from_port="preprocessing model" to_op="Group Models Learning Side" to_port="models in 1"/>
<connect from_op="SVM" from_port="model" to_op="Group Models Learning Side" to_port="models in 2"/>
<connect from_op="Group Models Learning Side" from_port="model out" to_port="stacking model"/>
<portSpacing port="source_stacking examples" spacing="0"/>
<portSpacing port="sink_stacking model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="124" name="Group Models" width="90" x="624" y="145"/>
<connect from_port="training set" to_op="z Score Normalize" to_port="example set input"/>
<connect from_op="z Score Normalize" from_port="example set output" to_op="PCA 32 Attributes" to_port="example set input"/>
<connect from_op="z Score Normalize" from_port="preprocessing model" to_op="Group Models" to_port="models in 1"/>
<connect from_op="PCA 32 Attributes" from_port="example set output" to_op="Stacked Models with SVM" to_port="training set"/>
<connect from_op="PCA 32 Attributes" from_port="preprocessing model" to_op="Group Models" to_port="models in 2"/>
<connect from_op="Stacked Models with SVM" from_port="model" to_op="Group Models" to_port="models in 3"/>
<connect from_op="Group Models" from_port="model out" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="7.5.001" expanded="true" height="82" name="Apply Stack Model" width="90" x="198" y="34">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="7.5.001" expanded="true" height="82" name="Stacking Perf." width="90" x="427" y="72">
<parameter key="AUC" value="true"/>
<parameter key="precision" value="true"/>
<parameter key="recall" value="true"/>
<parameter key="false_positive" value="true"/>
<parameter key="false_negative" value="true"/>
<parameter key="true_positive" value="true"/>
<parameter key="true_negative" value="true"/>
<parameter key="sensitivity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Stack Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Stack Model" to_port="unlabelled data"/>
<connect from_op="Apply Stack Model" from_port="labelled data" to_op="Stacking Perf." to_port="labelled data"/>
<connect from_op="Stacking Perf." from_port="performance" to_port="performance 1"/>
<connect from_op="Stacking Perf." from_port="example set" to_port="test set results"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_test set results" spacing="0"/>
<portSpacing port="sink_performance 1" spacing="0"/>
<portSpacing port="sink_performance 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
<connect from_op="Nominal to Binominal" from_port="example set output" to_op="Nom to Numeric" to_port="example set input"/>
<connect from_op="Nom to Numeric" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
<connect from_op="Extract Macro" from_port="example set" to_op="Generate Weight (Stratification)" to_port="example set input"/>
<connect from_op="Generate Weight (Stratification)" from_port="example set output" to_op="Cross Validation" to_port="example set"/>
<connect from_op="Cross Validation" from_port="model" to_port="result 1"/>
<connect from_op="Cross Validation" from_port="test result set" to_port="result 2"/>
<connect from_op="Cross Validation" from_port="performance 1" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>

RM Staff

Re: Using a SVM Within a Stacked Model...

Dear Michael,

 

I've switched to X-Val (Cross Validation), added a performance output, and passed it out at the end. It seems to work fine?

 

Best,

Martin

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="false" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="112" y="238">
        <parameter key="repository_entry" value="//Samples/data/Golf"/>
      </operator>
      <operator activated="true" class="read_excel" compatibility="7.5.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
        <parameter key="excel_file" value="C:\Users\Martin\Downloads\Customer_Credit_Risk_Data.xlsx"/>
        <parameter key="imported_cell_range" value="A1:V989"/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="foreignworker.true.polynominal.attribute"/>
          <parameter key="1" value="status.true.polynominal.attribute"/>
          <parameter key="2" value="credithistory.true.polynominal.attribute"/>
          <parameter key="3" value="purpose.true.polynominal.attribute"/>
          <parameter key="4" value="savings.true.polynominal.attribute"/>
          <parameter key="5" value="employmentsince.true.polynominal.attribute"/>
          <parameter key="6" value="otherdebtors.true.polynominal.attribute"/>
          <parameter key="7" value="property.true.polynominal.attribute"/>
          <parameter key="8" value="otherinstallments.true.polynominal.attribute"/>
          <parameter key="9" value="housing.true.polynominal.attribute"/>
          <parameter key="10" value="job.true.polynominal.attribute"/>
          <parameter key="11" value="phone.true.polynominal.attribute"/>
          <parameter key="12" value="duration.true.integer.attribute"/>
          <parameter key="13" value="creditamount.true.integer.attribute"/>
          <parameter key="14" value="installmentrate.true.integer.attribute"/>
          <parameter key="15" value="residencesince.true.integer.attribute"/>
          <parameter key="16" value="age.true.integer.attribute"/>
          <parameter key="17" value="numberofexsistingcredits.true.integer.attribute"/>
          <parameter key="18" value="numberofliablepeople.true.integer.attribute"/>
          <parameter key="19" value="gender.true.polynominal.attribute"/>
          <parameter key="20" value="creditworthy.true.polynominal.attribute"/>
          <parameter key="21" value="creditamout_per_month.true.numeric.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.5.001" expanded="true" height="82" name="Set Role" width="90" x="294" y="39">
        <parameter key="attribute_name" value="creditworthy"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="nominal_to_binominal" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="476" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="creditworthy"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nom to Numeric" width="90" x="663" y="34">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="credithistory|employmentsince|foreignworker|gender|housing|job|otherdebtors|otherinstallments|phone|property|purpose|savings|status"/>
        <list key="comparison_groups"/>
      </operator>
      <operator activated="true" class="extract_macro" compatibility="7.5.001" expanded="true" height="68" name="Extract Macro" width="90" x="835" y="48">
        <parameter key="macro" value="num_Examples"/>
        <list key="additional_macros"/>
      </operator>
      <operator activated="true" class="generate_weight_stratification" compatibility="7.5.001" expanded="true" height="82" name="Generate Weight (Stratification)" width="90" x="985" y="45">
        <parameter key="total_weight" value="%{num_Examples}"/>
      </operator>
      <operator activated="true" class="optimize_parameters_grid" compatibility="7.5.001" expanded="true" height="124" name="Optimize Parameters" width="90" x="1184" y="34">
        <list key="parameters">
          <parameter key="SVM (3).C" value="[0.1;10;2;logarithmic]"/>
        </list>
        <parameter key="error_handling" value="ignore error"/>
        <process expanded="true">
          <operator activated="true" class="concurrency:cross_validation" compatibility="7.5.001" expanded="true" height="145" name="Cross Validation (2)" width="90" x="112" y="34">
            <process expanded="true">
              <operator activated="true" class="normalize" compatibility="7.5.001" expanded="true" height="103" name="z Score Normalize (3)" width="90" x="45" y="158">
                <parameter key="attribute_filter_type" value="subset"/>
                <parameter key="attributes" value="age|creditamount|creditamout_per_month|duration"/>
              </operator>
              <operator activated="true" class="principal_component_analysis" compatibility="7.5.001" expanded="true" height="103" name="PCA 32 Attributes (3)" width="90" x="221" y="34">
                <parameter key="dimensionality_reduction" value="fixed number"/>
                <parameter key="variance_threshold" value="0.9"/>
                <parameter key="number_of_components" value="32"/>
              </operator>
              <operator activated="true" class="stacking" compatibility="7.5.001" expanded="true" height="68" name="Stacked Models with SVM (2)" width="90" x="447" y="442">
                <process expanded="true">
                  <operator activated="true" class="naive_bayes" compatibility="7.5.001" expanded="true" height="82" name="Stacking NB (3)" width="90" x="414" y="170"/>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost DT (3)" width="90" x="418" y="35">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="82" name="Stacking DT (3)" width="90" x="707" y="75">
                        <parameter key="maximal_depth" value="15"/>
                        <parameter key="minimal_gain" value="0.02"/>
                        <parameter key="minimal_leaf_size" value="6"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking DT (3)" to_port="training set"/>
                      <connect from_op="Stacking DT (3)" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost KNN (3)" width="90" x="415" y="330">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="k_nn" compatibility="7.5.001" expanded="true" height="82" name="Stacking KNN (3)" width="90" x="564" y="182">
                        <parameter key="k" value="20"/>
                        <parameter key="weighted_vote" value="true"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking KNN (3)" to_port="training set"/>
                      <connect from_op="Stacking KNN (3)" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="adaboost" compatibility="7.5.001" expanded="true" height="82" name="Ada Boost Log Regr (3)" width="90" x="407" y="488">
                    <parameter key="iterations" value="12"/>
                    <process expanded="true">
                      <operator activated="true" class="h2o:logistic_regression" compatibility="7.5.000" expanded="true" height="103" name="Stacking Log Regr (3)" width="90" x="544" y="139">
                        <parameter key="solver" value="L_BFGS"/>
                      </operator>
                      <connect from_port="training set" to_op="Stacking Log Regr (3)" to_port="training set"/>
                      <connect from_op="Stacking Log Regr (3)" from_port="model" to_port="model"/>
                      <portSpacing port="source_training set" spacing="0"/>
                      <portSpacing port="sink_model" spacing="0"/>
                    </process>
                  </operator>
                  <connect from_port="training set 1" to_op="Stacking NB (3)" to_port="training set"/>
                  <connect from_port="training set 2" to_op="Ada Boost DT (3)" to_port="training set"/>
                  <connect from_port="training set 3" to_op="Ada Boost KNN (3)" to_port="training set"/>
                  <connect from_port="training set 4" to_op="Ada Boost Log Regr (3)" to_port="training set"/>
                  <connect from_op="Stacking NB (3)" from_port="model" to_port="base model 1"/>
                  <connect from_op="Ada Boost DT (3)" from_port="model" to_port="base model 2"/>
                  <connect from_op="Ada Boost KNN (3)" from_port="model" to_port="base model 3"/>
                  <connect from_op="Ada Boost Log Regr (3)" from_port="model" to_port="base model 4"/>
                  <portSpacing port="source_training set 1" spacing="0"/>
                  <portSpacing port="source_training set 2" spacing="0"/>
                  <portSpacing port="source_training set 3" spacing="0"/>
                  <portSpacing port="source_training set 4" spacing="0"/>
                  <portSpacing port="source_training set 5" spacing="0"/>
                  <portSpacing port="sink_base model 1" spacing="0"/>
                  <portSpacing port="sink_base model 2" spacing="0"/>
                  <portSpacing port="sink_base model 3" spacing="0"/>
                  <portSpacing port="sink_base model 4" spacing="0"/>
                  <portSpacing port="sink_base model 5" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" height="103" name="Nominal to Numerical (2)" width="90" x="45" y="34">
                    <list key="comparison_groups"/>
                  </operator>
                  <operator activated="true" class="support_vector_machine" compatibility="7.5.001" expanded="true" height="124" name="SVM (3)" width="90" x="179" y="34">
                    <parameter key="C" value="10.0"/>
                  </operator>
                  <operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="103" name="Group Models (4)" width="90" x="313" y="136"/>
                  <connect from_port="stacking examples" to_op="Nominal to Numerical (2)" to_port="example set input"/>
                  <connect from_op="Nominal to Numerical (2)" from_port="example set output" to_op="SVM (3)" to_port="training set"/>
                  <connect from_op="Nominal to Numerical (2)" from_port="preprocessing model" to_op="Group Models (4)" to_port="models in 1"/>
                  <connect from_op="SVM (3)" from_port="model" to_op="Group Models (4)" to_port="models in 2"/>
                  <connect from_op="Group Models (4)" from_port="model out" to_port="stacking model"/>
                  <portSpacing port="source_stacking examples" spacing="0"/>
                  <portSpacing port="sink_stacking model" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" height="124" name="Group Models (5)" width="90" x="649" y="196"/>
              <connect from_port="training set" to_op="z Score Normalize (3)" to_port="example set input"/>
              <connect from_op="z Score Normalize (3)" from_port="example set output" to_op="PCA 32 Attributes (3)" to_port="example set input"/>
              <connect from_op="z Score Normalize (3)" from_port="preprocessing model" to_op="Group Models (5)" to_port="models in 1"/>
              <connect from_op="PCA 32 Attributes (3)" from_port="example set output" to_op="Stacked Models with SVM (2)" to_port="training set"/>
              <connect from_op="PCA 32 Attributes (3)" from_port="preprocessing model" to_op="Group Models (5)" to_port="models in 2"/>
              <connect from_op="Stacked Models with SVM (2)" from_port="model" to_op="Group Models (5)" to_port="models in 3"/>
              <connect from_op="Group Models (5)" from_port="model out" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="7.5.001" expanded="true" height="82" name="Apply Stack Model (3)" width="90" x="45" y="34">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance_binominal_classification" compatibility="7.5.001" expanded="true" height="82" name="Stacking Perf. (3)" width="90" x="384" y="37">
                <parameter key="AUC" value="true"/>
                <parameter key="precision" value="true"/>
                <parameter key="recall" value="true"/>
                <parameter key="false_positive" value="true"/>
                <parameter key="false_negative" value="true"/>
                <parameter key="true_positive" value="true"/>
                <parameter key="true_negative" value="true"/>
                <parameter key="sensitivity" value="true"/>
              </operator>
              <connect from_port="model" to_op="Apply Stack Model (3)" to_port="model"/>
              <connect from_port="test set" to_op="Apply Stack Model (3)" to_port="unlabelled data"/>
              <connect from_op="Apply Stack Model (3)" from_port="labelled data" to_op="Stacking Perf. (3)" to_port="labelled data"/>
              <connect from_op="Stacking Perf. (3)" from_port="performance" to_port="performance 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_test set results" spacing="0"/>
              <portSpacing port="sink_performance 1" spacing="0"/>
              <portSpacing port="sink_performance 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="log" compatibility="7.5.001" expanded="true" height="82" name="Log" width="90" x="246" y="136">
            <list key="log">
              <parameter key="perf" value="operator.Cross Validation (2).value.performance main criterion"/>
              <parameter key="SVM_C" value="operator.SVM (3).parameter.C"/>
            </list>
          </operator>
          <connect from_port="input 1" to_op="Cross Validation (2)" to_port="example set"/>
          <connect from_op="Cross Validation (2)" from_port="performance 1" to_op="Log" to_port="through 1"/>
          <connect from_op="Log" from_port="through 1" to_port="performance"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="log_to_data" compatibility="7.5.001" expanded="true" height="103" name="Log to Data" width="90" x="1385" y="238">
        <parameter key="log_name" value="Log"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
      <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Nom to Numeric" to_port="example set input"/>
      <connect from_op="Nom to Numeric" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
      <connect from_op="Extract Macro" from_port="example set" to_op="Generate Weight (Stratification)" to_port="example set input"/>
      <connect from_op="Generate Weight (Stratification)" from_port="example set output" to_op="Optimize Parameters" to_port="input 1"/>
      <connect from_op="Optimize Parameters" from_port="performance" to_port="result 1"/>
      <connect from_op="Optimize Parameters" from_port="parameter" to_port="result 2"/>
      <connect from_op="Optimize Parameters" from_port="result 1" to_op="Log to Data" to_port="through 1"/>
      <connect from_op="Log to Data" from_port="exampleSet" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>
--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Certified Analyst
Solution
Accepted by topic author M_Martin
‎06-19-2017 10:46 AM

Re: Using a SVM Within a Stacked Model...

Hello Martin:

You're absolutely right - I can only guess that I made an error in how I grouped the models.  I also needed to apply your tip about converting the nominal predictions coming from the "Base Learners" side of the Stacked Model to numerical values before feeding them to the SVM on the "Learner" side of the Stacked Model.

I rebuilt the process from scratch, taking care to group the models correctly and to apply the above-mentioned tip from you, and everything now works as expected. ;-)
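
For anyone rebuilding this, here is a minimal sketch of what the "Learner" side of the Stacking operator might look like with the nominal-to-numerical conversion in place. It is not copied from the attached process: the operator names ("Predictions to Numeric", "Stacking SVM", "Group Stacking Models") are made up for illustration, and the inner port names "stacking examples" and "stacking model" as well as the "dummy coding" setting are assumptions that should be checked against your RapidMiner version. The idea is that the conversion's preprocessing model is grouped with the SVM model, so the same conversion is applied again when the stacked model is used on the test set.

```xml
<!-- Sketch only: the "Learner" subprocess of a Stacking operator. -->
<!-- Operator names and the inner port names "stacking examples" / "stacking model" are assumptions. -->
<process expanded="true">
  <!-- Convert the nominal base-learner predictions to numbers so the SVM can use them -->
  <operator activated="true" class="nominal_to_numerical" compatibility="7.5.001" expanded="true" name="Predictions to Numeric">
    <parameter key="coding_type" value="dummy coding"/>
  </operator>
  <!-- SVM left at its default parameters for this sketch -->
  <operator activated="true" class="support_vector_machine" compatibility="7.5.001" expanded="true" name="Stacking SVM"/>
  <!-- Bundle the conversion model and the SVM model into one model -->
  <operator activated="true" class="group_models" compatibility="7.5.001" expanded="true" name="Group Stacking Models"/>
  <connect from_port="stacking examples" to_op="Predictions to Numeric" to_port="example set input"/>
  <connect from_op="Predictions to Numeric" from_port="example set output" to_op="Stacking SVM" to_port="training set"/>
  <connect from_op="Predictions to Numeric" from_port="preprocessing model" to_op="Group Stacking Models" to_port="models in 1"/>
  <connect from_op="Stacking SVM" from_port="model" to_op="Group Stacking Models" to_port="models in 2"/>
  <connect from_op="Group Stacking Models" from_port="model out" to_port="stacking model"/>
</process>
```

The order of the inputs to the grouping operator matters: the conversion model goes in first and the SVM model second, so that they are applied in that order.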

My sincere thanks for your patience and advice, very much appreciated.  

As far as I'm concerned, it looks like we can close this issue.

Attached is the test version I just put together; it runs without errors.

All the best and kind regards, Michael
