Grid Parameter Optimization

lexusboy Member Posts: 22 Maven
Hello,

I am a little confused by the behavior of the Grid Parameter Optimization operator. As I understand it, it performs a grid search over a list of parameter values for a particular machine learning algorithm (e.g. LibSVM) on a particular set of data, and gives you the best (or optimal) parameter value for that data.

When I ran my tests I also used the ProcessLog operator to track the values as the tests went on. Below is the output from one such test: (1) the ProcessLog output and (2) the parameter set produced by the Grid Parameter Optimization operator. From the log you can clearly see that the C values 128 and 0.5 yield the best results, but strangely the grid search gives "64" as the optimal value. This is just one example; I have many more test outputs that show similar results. Could somebody please explain this? Thanks!

1)
# Generated by ProcessLog[com.rapidminer.operator.visualization.ProcessLogOperator]
# ClassificationAccuracy    C Parameter
NaN                         0.03125
0.4875                      0.0625
0.5                         0.125
0.5                         0.25
0.7125                      0.5
0.65                        1.0
0.5875                      2.0
0.6                         4.0
0.6625                      8.0
0.6375                      16.0
0.6625                      32.0
0.625                       64.0
0.725                       128.0
0.6625                      256.0
0.625                       512.0
0.6625                      1024.0
0.6875                      2048.0
0.6375                      4096.0
0.6375                      8192.0
0.6625                      16384.0
0.6875                      32768.0

2)
<?xml version="1.0" encoding="windows-1252"?>
<parameterset version="4.6">
    <parameter operator="LibSVMLearner" key="C" value="64"/>
</parameterset>

Answers

  • haddock Member Posts: 849 Maven
    Hi there,

    Sounds interesting, perhaps you could post the XML for the process?
  • lexusboy Member Posts: 22 Maven
    Hi Haddock,

    Here is the XML of the process:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\test_2_svm_negative_out_200_tf_idf.aml"/>
        </operator>
        <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
            <list key="parameters">
              <parameter key="LibSVMLearner.C" value="0.03125,0.0625,0.125,0.25,0.5,1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,32768"/>       
            </list>
            <operator name="ProcessLog" class="ProcessLog">
                <parameter key="filename" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\processLog.log"/>
                <list key="log">
                  <parameter key="ClassificationAccuracy" value="operator.ClassificationPerformance.value.accuracy"/>
                  <parameter key="C Parameter" value="operator.LibSVMLearner.parameter.C"/>           
                </list>
                <parameter key="persistent" value="true"/>
            </operator>
            <operator name="XValidation" class="XValidation" expanded="yes">
                <parameter key="number_of_validations" value="5"/>
                <operator name="LibSVMLearner" class="LibSVMLearner">           
                    <parameter key="C" value="32768"/>
                    <list key="class_weights">
                    </list>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="ClassificationPerformance" class="ClassificationPerformance">
                        <parameter key="main_criterion" value="accuracy"/>
                        <parameter key="accuracy" value="true"/>
                        <parameter key="classification_error" value="true"/>
                        <parameter key="weighted_mean_recall" value="true"/>
                        <parameter key="weighted_mean_precision" value="true"/>
                        <parameter key="absolute_error" value="true"/>
                        <parameter key="relative_error" value="true"/>
                        <list key="class_weights">
                        </list>
                    </operator>
                </operator>
            </operator>
        </operator>
        <operator name="ParameterSetWriter" class="ParameterSetWriter">
            <parameter key="parameter_file" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\gridParameters.par"/>
        </operator>   
    </operator>


    Thanks for looking into this :)
  • haddock Member Posts: 849 Maven
    Hola lexusboy,

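    I think you just have the ProcessLog in the wrong place: it needs to come after the XValidation rather than before it. Placed before, it records the current C value next to the accuracy from the previous pass, so every accuracy in your log is shifted down by one row. That is also why your earlier version did not provide a performance figure for the first pass:

    # ClassificationAccuracy    C Parameter
    NaN                         0.03125

    To save you the bother, here is the process with the log moved: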
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource" activated="no">
            <parameter key="attributes" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\test_2_svm_negative_out_200_tf_idf.aml"/>
        </operator>
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="random"/>
        </operator>
        <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="parameter_string" value="label"/>
            <parameter key="attribute_name_regex" value="label"/>
            <parameter key="process_special_attributes" value="true"/>
            <operator name="BinDiscretization" class="BinDiscretization">
                <parameter key="range_name_type" value="short"/>
            </operator>
        </operator>
        <operator name="GridParameterOptimization" class="GridParameterOptimization" expanded="yes">
            <list key="parameters">
              <parameter key="LibSVMLearner.C" value="0.03125,0.0625,0.125,0.25,0.5,1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,32768"/>
            </list>
            <operator name="XValidation" class="XValidation" expanded="no">
                <parameter key="number_of_validations" value="5"/>
                <operator name="LibSVMLearner" class="LibSVMLearner">
                    <parameter key="C" value="32768"/>
                    <list key="class_weights">
                    </list>
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="ClassificationPerformance" class="ClassificationPerformance">
                        <parameter key="main_criterion" value="accuracy"/>
                        <parameter key="accuracy" value="true"/>
                        <parameter key="classification_error" value="true"/>
                        <parameter key="weighted_mean_recall" value="true"/>
                        <parameter key="weighted_mean_precision" value="true"/>
                        <parameter key="absolute_error" value="true"/>
                        <parameter key="relative_error" value="true"/>
                        <list key="class_weights">
                        </list>
                    </operator>
                </operator>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <parameter key="filename" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\processLog.log"/>
                <list key="log">
                  <parameter key="ClassificationAccuracy" value="operator.ClassificationPerformance.value.accuracy"/>
                  <parameter key="C Parameter" value="operator.LibSVMLearner.parameter.C"/>
                </list>
                <parameter key="persistent" value="true"/>
            </operator>
        </operator>
        <operator name="ParameterSetWriter" class="ParameterSetWriter">
            <parameter key="parameter_file" value="C:\Documents and Settings\Lexusboy\My Documents\RapidMiner\Supervised\gridParameters.par"/>
        </operator>
    </operator>
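
For anyone who wants to see the effect outside RapidMiner, here is a minimal Python sketch of the same mistake. It is not RapidMiner code: `evaluate` is a made-up stand-in for the cross-validated accuracy (chosen so that it peaks at C = 64), and `grid_search` is a hypothetical loop that either logs before or after the evaluation step, assuming the log simply reads whatever accuracy was computed last.

```python
import math

def evaluate(c):
    """Hypothetical stand-in for cross-validated accuracy; peaks at c == 64."""
    return 1.0 / (1.0 + abs(math.log2(c) - 6))

def grid_search(values, log_before):
    """Try every value of c; log either before or after evaluating it."""
    log = []
    last_accuracy = float("nan")  # nothing has been evaluated yet
    best_c, best_accuracy = None, float("-inf")
    for c in values:
        if log_before:
            # The log sees the new c but only the previous pass's accuracy.
            log.append((last_accuracy, c))
        last_accuracy = evaluate(c)
        if not log_before:
            # The log sees the accuracy that belongs to this c.
            log.append((last_accuracy, c))
        if last_accuracy > best_accuracy:
            best_c, best_accuracy = c, last_accuracy
    return best_c, log

# Same grid as in the thread: 2^-5 ... 2^15
values = [2.0 ** k for k in range(-5, 16)]

best_c, shifted = grid_search(values, log_before=True)
_, aligned = grid_search(values, log_before=False)

print("optimum found by the search:", best_c)
print("first row of the shifted log:", shifted[0])
print("best row of the shifted log:", max(shifted[1:], key=lambda row: row[0]))
print("best row of the aligned log:", max(aligned, key=lambda row: row[0]))
```

With the log placed first, the first row has NaN and the best accuracy appears one row too late (next to 128 in this toy setup), while the search itself still reports 64. That is the same pattern as in the posted log, where 0.725 is printed next to 128 but the operator returns 64.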
  • lexusboy Member Posts: 22 Maven
    Hi Haddock,

    Yes, that's what was wrong... thanks for your help ;)

    Cheers