Write best model to disk

chris_ml Member Posts: 17 Maven
edited November 2018 in Help
Hi,

I would like to compare the performance of two different learners with the
T-Test and Anova operators and finally write the better model to disk in order
to use it later.

This is the process I use for the performance evaluation:

T-Test and Anova

<operator name="Root" class="Process" expanded="yes">
    <description text="#ylt#p#ygt#Many RapidMiner operators can be used to estimate the performance of a learner, a preprocessing step, or a feature space on one or several data sets. The result of these validation operators is a performance vector collecting the values of a set of performance criteria. For each criterion, the mean value and standard deviation are given. #ylt#/p#ygt#  #ylt#p#ygt#The question is how these performance vectors can be compared? Statistical significance tests like ANOVA or pairwise t-tests can be used to calculate the probability that the actual mean values are different. #ylt#/p#ygt# #ylt#p#ygt# We assume that you have achieved several performance vectors and want to compare them. In this experiment we use the same data set for both cross validations (hence the IOMultiplier) and estimate the performance of a linear learning scheme and a RBF based SVM. #ylt#/p#ygt# #ylt#p#ygt# Run the experiment and compare the results: the probabilities for a significant difference are equal since only two performance vectors were created. In this case the SVM is probably better suited for the data set at hand since the actual mean values are probably different.#ylt#/p#ygt##ylt#p#ygt#Please note that performance vectors like all other objects which can be passed between RapidMiner operators can be written into and loaded from a file.#ylt#/p#ygt#"/>
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="attributes_lower_bound" value="-40.0"/>
        <parameter key="attributes_upper_bound" value="30.0"/>
        <parameter key="number_examples" value="80"/>
        <parameter key="number_of_attributes" value="1"/>
        <parameter key="target_function" value="one variable non linear"/>
    </operator>
    <operator name="IOMultiplier" class="IOMultiplier">
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <parameter key="sampling_type" value="shuffled sampling"/>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="10000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="svm_type" value="nu-SVR"/>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="RegressionPerformance" class="RegressionPerformance">
                <parameter key="absolute_error" value="true"/>
            </operator>
        </operator>
    </operator>
    <operator name="XValidation (2)" class="XValidation" expanded="yes">
        <parameter key="sampling_type" value="shuffled sampling"/>
        <operator name="LinearRegression" class="LinearRegression">
        </operator>
        <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier (2)" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="RegressionPerformance (2)" class="RegressionPerformance">
                <parameter key="absolute_error" value="true"/>
            </operator>
        </operator>
    </operator>
    <operator name="T-Test" class="T-Test">
    </operator>
    <operator name="Anova" class="Anova">
    </operator>
</operator>
I have no idea how to retrieve the better learner after the evaluation with T-Test and
Anova. I assume that both models must be stored temporarily before the T-Test
operator and that the better learner is then written to disk depending on the
PerformanceVector of the Anova, right? But I don't know how to do that.
Any ideas?

Regards,
Chris

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Chris,
    as far as I know, ANOVA only calculates the probability that the models are not the same. How would you select the better model from that?
    Instead, you could use the original performance vectors to choose the better model. If you want to compare different learners, try something like this:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="attributes_lower_bound" value="-40.0"/>
            <parameter key="attributes_upper_bound" value="30.0"/>
            <parameter key="number_examples" value="80"/>
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
            <list key="parameters">
              <parameter key="OperatorSelector_train.select_which" value="[1.0;2.0;10;linear]"/>
            </list>
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="average_performances_only" value="false"/>
                <parameter key="keep_example_set" value="true"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="OperatorSelector_train" class="OperatorSelector" expanded="yes">
                    <parameter key="select_which" value="2"/>
                    <operator name="LibSVMLearner" class="LibSVMLearner">
                        <parameter key="C" value="10000.0"/>
                        <list key="class_weights">
                        </list>
                        <parameter key="svm_type" value="nu-SVR"/>
                    </operator>
                    <operator name="LinearRegression" class="LinearRegression">
                    </operator>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="RegressionPerformance" class="RegressionPerformance">
                        <parameter key="absolute_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="ParameterSetter" class="ParameterSetter">
            <list key="name_map">
              <parameter key="OperatorSelector_train" value="OperatorSelector_apply"/>
            </list>
        </operator>
        <operator name="OperatorSelector_apply" class="OperatorSelector" expanded="yes">
            <operator name="LibSVMLearner (2)" class="LibSVMLearner">
                <parameter key="C" value="10000.0"/>
                <list key="class_weights">
                </list>
                <parameter key="svm_type" value="nu-SVR"/>
            </operator>
            <operator name="LinearRegression (2)" class="LinearRegression">
            </operator>
        </operator>
    </operator>
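    Since your original goal was to write the better model to disk: if I am not mistaken, you can simply append a ModelWriter operator after OperatorSelector_apply (which trains the selected learner on the complete example set) to store the resulting model in a file. A minimal sketch (the file name is of course just an example):

    <operator name="ModelWriter" class="ModelWriter">
        <!-- example path, adjust to your needs -->
        <parameter key="model_file" value="C:\best_model.mod"/>
    </operator>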
    Greetings,
      Sebastian
  • chris_ml Member Posts: 17 Maven
    Hey Sebastian,

    the model you proposed is what I was looking for. :-)

    However, I tried to extend it but could not find a working solution.
    What I want is to replace the simple learners with their default parameters
    inside OperatorSelector_train by a GridParameterOptimization, i.e. for two different
    learners I want to optimize their parameters and finally return the most
    accurate learner so that it can be used later.

    The first problem is that I was not able to add a GridParameterOptimization
    to your process:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="attributes_lower_bound" value="-40.0"/>
            <parameter key="attributes_upper_bound" value="30.0"/>
            <parameter key="number_examples" value="80"/>
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
            <list key="parameters">
              <parameter key="OperatorSelector_train.select_which" value="[1.0;2.0;10;linear]"/>
            </list>
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="average_performances_only" value="false"/>
                <parameter key="keep_example_set" value="true"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="OperatorSelector_train" class="OperatorSelector" expanded="yes">
                    <parameter key="select_which" value="2"/>
                    <operator name="GridParameterOptimization (2)" class="GridParameterOptimization" expanded="yes">
                        <list key="parameters">
                          <parameter key="LibSVMLearner.svm_type" value="C-SVC,nu-SVC,one-class,epsilon-SVR,nu-SVR"/>
                          <parameter key="LibSVMLearner.degree" value="[1.0;1000.0;10;linear]"/>
                        </list>
                        <operator name="LibSVMLearner" class="LibSVMLearner">
                            <parameter key="C" value="10000.0"/>
                            <list key="class_weights">
                            </list>
                            <parameter key="svm_type" value="nu-SVR"/>
                        </operator>
                    </operator>
                    <operator name="GridParameterOptimization (3)" class="GridParameterOptimization" expanded="yes">
                        <list key="parameters">
                          <parameter key="LinearRegression.feature_selection" value="none,M5 prime,greedy"/>
                          <parameter key="LinearRegression.ridge" value="[0.0;1000.0;10;linear]"/>
                        </list>
                        <operator name="LinearRegression" class="LinearRegression">
                        </operator>
                    </operator>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="RegressionPerformance" class="RegressionPerformance">
                        <parameter key="absolute_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="ParameterSetter" class="ParameterSetter">
            <list key="name_map">
              <parameter key="OperatorSelector_train" value="OperatorSelector_apply"/>
            </list>
        </operator>
        <operator name="OperatorSelector_apply" class="OperatorSelector" expanded="yes">
            <operator name="LibSVMLearner (2)" class="LibSVMLearner">
                <parameter key="C" value="10000.0"/>
                <list key="class_weights">
                </list>
                <parameter key="svm_type" value="nu-SVR"/>
            </operator>
            <operator name="LinearRegression (2)" class="LinearRegression">
            </operator>
        </operator>
    </operator>
    The second question that comes to my mind is how to propagate the best
    parameters. In your current process, you just pass one of the two learners
    to OperatorSelector_apply. But with my parameter optimization, two pieces of
    information must be passed: 1) the best learner (as currently done) and
    2) the corresponding parameter set for that learner. How can this be achieved
    when choosing among different learners?

    I would appreciate again your help. :-)

    Regards,
    Chris

  • chris_ml Member Posts: 17 Maven
    Hi guys,

    I'm still stuck with this problem and can't continue my evaluations.  :-\
    Please help me out.

    Thanks a lot.

    Chris
  • steffen Member Posts: 347 Maven
    Hello Chris

    Your posted setup does not work because the requirements of the corresponding operators are not met. If you select an operator and press "F1", a window opens where you can find the required input, the delivered output, and the requirements for the inner operators (if the selected operator is some kind of OperatorChain).

    GridParameterOptimization requires that its child operators deliver a performance vector. To produce a performance vector you must build a standard train-the-model-and-apply-it scheme. Something like this:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="no">
            <list key="parameters">
              <parameter key="OperatorSelector_train.select_which" value="[1.0;2.0;10;linear]"/>
            </list>
            <operator name="XValidation" class="XValidation" expanded="no">
                <parameter key="average_performances_only" value="false"/>
                <parameter key="keep_example_set" value="true"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="train" class="OperatorChain" expanded="yes">
                    <operator name="LinearRegression" class="LinearRegression">
                    </operator>
                </operator>
                <operator name="apply" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="RegressionPerformance" class="RegressionPerformance">
                        <parameter key="absolute_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
    </operator>
    Got it? Fine, now let's deal with your problems in detail:

    I suggest a process like this:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="attributes_lower_bound" value="-40.0"/>
            <parameter key="attributes_upper_bound" value="30.0"/>
            <parameter key="number_examples" value="80"/>
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="opt_linreg" class="GridParameterOptimization" expanded="no">
            <list key="parameters">
              <parameter key="LinearRegression.feature_selection" value="none,M5 prime,greedy"/>
              <parameter key="LinearRegression.ridge" value="[0.0;1000.0;10;linear]"/>
            </list>
            <operator name="XValidation" class="XValidation" expanded="no">
                <parameter key="average_performances_only" value="false"/>
                <parameter key="keep_example_set" value="true"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="LinearRegression" class="LinearRegression">
                    <parameter key="feature_selection" value="greedy"/>
                    <parameter key="ridge" value="1000.0"/>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="no">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="RegressionPerformance" class="RegressionPerformance">
                        <parameter key="absolute_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="PerformanceWriter" class="PerformanceWriter">
            <parameter key="performance_file" value="C:\linreg.per"/>
        </operator>
        <operator name="opt_libsvm" class="GridParameterOptimization" expanded="no">
            <list key="parameters">
              <parameter key="LibSVMLearner.svm_type" value="epsilon-SVR,nu-SVR"/>
              <parameter key="LibSVMLearner.degree" value="[1.0;1000.0;10;linear]"/>
            </list>
            <operator name="XValidation (2)" class="XValidation" expanded="yes">
                <parameter key="average_performances_only" value="false"/>
                <parameter key="keep_example_set" value="true"/>
                <parameter key="sampling_type" value="shuffled sampling"/>
                <operator name="LibSVMLearner" class="LibSVMLearner">
                    <parameter key="C" value="10000.0"/>
                    <list key="class_weights">
                    </list>
                    <parameter key="degree" value="301"/>
                    <parameter key="svm_type" value="nu-SVR"/>
                </operator>
                <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier (2)" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="RegressionPerformance (2)" class="RegressionPerformance">
                        <parameter key="absolute_error" value="true"/>
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="PerformanceWriter (2)" class="PerformanceWriter">
            <parameter key="performance_file" value="C:\libsvm.per"/>
        </operator>
        <operator name="kill_all_performance_measures" class="IOConsumer">
            <parameter key="io_object" value="PerformanceVector"/>
        </operator>
        <operator name="select_model" class="GridParameterOptimization" expanded="no">
            <list key="parameters">
              <parameter key="OperatorSelector_train.select_which" value="[1.0;2.0;1;linear]"/>
            </list>
            <operator name="OperatorSelector_train" class="OperatorSelector" expanded="yes">
                <parameter key="select_which" value="2"/>
                <operator name="PerformanceLoader" class="PerformanceLoader">
                    <parameter key="performance_file" value="C:\linreg.per"/>
                </operator>
                <operator name="PerformanceLoader (2)" class="PerformanceLoader">
                    <parameter key="performance_file" value="C:\libsvm.per"/>
                </operator>
            </operator>
        </operator>
        <operator name="ParameterSetter" class="ParameterSetter">
            <list key="name_map">
              <parameter key="OperatorSelector_train" value="OperatorSelector_apply"/>
            </list>
        </operator>
        <operator name="OperatorSelector_apply" class="OperatorSelector" expanded="no">
            <parameter key="select_which" value="2"/>
            <operator name="OperatorChain (3)" class="OperatorChain" expanded="no">
                <operator name="linregset" class="ParameterSetLoader">
                    <parameter key="parameter_file" value="C:\linreg.par"/>
                </operator>
                <operator name="ParameterSetter (2)" class="ParameterSetter">
                    <list key="name_map">
                      <parameter key="opt_linreg" value="apply_linreg"/>
                    </list>
                </operator>
                <operator name="apply_linreg" class="LinearRegression">
                </operator>
            </operator>
            <operator name="OperatorChain (4)" class="OperatorChain" expanded="no">
                <operator name="ParameterSetter (3)" class="ParameterSetter">
                    <list key="name_map">
                      <parameter key="opt_libsvm" value="apply_libsvm"/>
                    </list>
                </operator>
                <operator name="apply_libsvm" class="LibSVMLearner">
                    <parameter key="C" value="10000.0"/>
                    <list key="class_weights">
                    </list>
                    <parameter key="svm_type" value="nu-SVR"/>
                </operator>
            </operator>
        </operator>
        <operator name="IOConsumer" class="IOConsumer">
            <parameter key="io_object" value="ParameterSet"/>
        </operator>
        <operator name="IOConsumer (2)" class="IOConsumer">
            <parameter key="io_object" value="PerformanceVector"/>
        </operator>
    </operator>

    The drawback of this setup is that the performance vectors have to be saved to disk, but I hope this is not too severe. Two reasons:
    • As far as I can see, it is currently not possible to move the parameter set beyond the model-selection optimization operator (within the process).
    • More importantly: you should take a look at the difference between the performance measures. Why? Imagine two kinds of models, one simple and fast, the other slow and complicated. The automatic selection determines that the second one is better, but looking at the performance values you notice that it is only 10^-4 better than the first one. Does this difference justify a much more complicated model?
    One may ask: why not nest the optimizations, as you tried in your last post? The main reason is that you would not be able to get a usable parameter set. But:
    If your data set is big enough, you can use a nested optimization to show that the grid optimization in combination with the selected learning algorithm is not going to hurt the generalization power of the model.
    The suggested process above can also do this, but in my opinion it is less powerful in judging the generalization power. On the other hand, it is capable of producing a parameter set for the given data set. So let's use it.
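    One more remark: the ParameterSetLoader inside OperatorChain (3) expects the file C:\linreg.par to exist. I assume you therefore also need a ParameterSetWriter directly after opt_linreg (and analogously for the SVM branch, if you want to load its parameters from disk as well) that writes the optimal parameter set found by the optimization. A sketch of what I mean:

    <operator name="ParameterSetWriter" class="ParameterSetWriter">
        <!-- writes the parameter set delivered by opt_linreg; same file the loader reads -->
        <parameter key="parameter_file" value="C:\linreg.par"/>
    </operator>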

    greetings

    Steffen

    PS: @RapidMiner guys: if you remove the last two consumer operators, some strange result objects pop out of nowhere (or their names are somehow twisted). Setting a breakpoint after OperatorChain (4) shows that everything is all right. Since I consider OperatorSelector a simple extension of OperatorChain, I wonder where the "other" result objects are "generated".