ProcessLog

Legacy User Member Posts: 0 Newbie
edited November 2018 in Help
Hi,

I'm trying to log the results of a leave-one-out cross validation. This
is my (simplified) model:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="../data/labor-negotiations.aml"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <parameter key="leave_one_out" value="true"/>
        <parameter key="number_of_validations" value="5"/>
        <operator name="DecisionTree" class="DecisionTree">
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="ClassificationPerformance" class="ClassificationPerformance">
                <parameter key="accuracy" value="true"/>
                <parameter key="classification_error" value="true"/>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <parameter key="filename" value="process.log"/>
                <list key="log">
                  <parameter key="classification_error" value="operator.ClassificationPerformance.value.classification_error"/>
                  <parameter key="accurracy" value="operator.ClassificationPerformance.value.accuracy"/>
                  <parameter key="deviation" value="operator.XValidation.value.deviation"/>
                </list>
                <parameter key="persistent" value="true"/>
            </operator>
        </operator>
    </operator>
</operator>
So, I'm trying to log the classification error and the accuracy of the ClassificationPerformance
operator as well as the deviation of the XValidation operator. However, the output is not what I expected:

# Generated by ProcessLog[com.rapidminer.operator.visualization.ProcessLogOperator]
# classification_error  accurracy      deviation
0.0    1.0    0.4175823272122516
0.0    1.0    0.4175823272122516
0.0    1.0    0.4175823272122516
...
I'm surprised that the first two values, which only switch between 0.0 and 1.0, do not
indicate the real classification error and accuracy measured for the validation
of a particular run. Also, I don't really understand why the deviation is always
the same. Shouldn't it vary during the cross validation as different
accuracy values are collected?

If I'm doing something wrong, how can I correctly log the results of a
cross validation?

Regards,
Paul

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Paul,
    the output isn't really unexpected, since you are logging the performance of every single application step of the cross validation. Since you use leave-one-out validation, each application works on a set of only one example. Hence the accuracy can only be 0 or 1, and the same holds for the error. A deviation does not make any sense at all here, since there is never any deviation if you have only one example...
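To make the point about single-example test sets concrete, here is a minimal Python sketch. It is not RapidMiner code; the function `fold_accuracy` and the labels "good" and "bad" are made up for illustration. It only shows that the accuracy of a fold holding one example can be nothing other than 0.0 or 1.0.

```python
def fold_accuracy(predictions, labels):
    """Fraction of predictions in one validation fold that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Leave-one-out: every test fold holds exactly one example (hypothetical labels).
hit = fold_accuracy(["good"], ["good"])   # the single example was classified correctly
miss = fold_accuracy(["bad"], ["good"])   # the single example was misclassified
print(hit, miss)  # 1.0 0.0

# Only a fold with several examples can yield an intermediate value.
larger_fold = fold_accuracy(
    ["good", "bad", "good", "good"],
    ["good", "good", "good", "good"],
)
print(larger_fold)  # 0.75
```

The classification error of such a fold is simply one minus the accuracy, so it is restricted to 0.0 or 1.0 in the same way.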
    If you want to log the total result, then you should set it up like this:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="C:\Dokumente und Einstellungen\sland\Eigene Dateien\Yale\RapidMiner_Zaniah\sample\data\labor-negotiations.aml"/>
        </operator>
        <operator name="XValidation" class="XValidation" expanded="yes">
            <parameter key="leave_one_out" value="true"/>
            <parameter key="number_of_validations" value="5"/>
            <operator name="DecisionTree" class="DecisionTree">
            </operator>
            <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                <operator name="ModelApplier" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="ClassificationPerformance" class="ClassificationPerformance">
                    <parameter key="accuracy" value="true"/>
                    <parameter key="classification_error" value="true"/>
                </operator>
            </operator>
        </operator>
        <operator name="ProcessLog" class="ProcessLog">
            <list key="log">
              <parameter key="classification_error" value="operator.XValidation.value.performance1"/>
              <parameter key="accurracy" value="operator.XValidation.value.performance2"/>
              <parameter key="deviation" value="operator.XValidation.value.deviation"/>
            </list>
        </operator>
    </operator>
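As a rough illustration of what the values logged after the validation stand for, the Python sketch below aggregates per-fold results by hand. It rests on assumptions that the thread does not confirm: that the total accuracy is the mean of the per-fold accuracies, that the deviation is the population standard deviation across folds, and that 31 of 40 held-out examples were classified correctly (hypothetical counts).

```python
from statistics import mean, pstdev

# Hypothetical per-fold accuracies from a leave-one-out run:
# 1.0 where the single held-out example was classified correctly, 0.0 otherwise.
fold_accuracies = [1.0] * 31 + [0.0] * 9

# Aggregate over all folds (assumed formulas, see the note above).
accuracy = mean(fold_accuracies)
classification_error = 1.0 - accuracy
deviation = pstdev(fold_accuracies)  # population standard deviation across folds

print(f"accuracy={accuracy:.3f} error={classification_error:.3f} deviation={deviation:.4f}")
```

With these assumed counts the deviation comes out at about 0.4176, which is close to the value in the log from the question, although nothing in the thread confirms that this is how it was produced. The point of the sketch is only that a meaningful accuracy and deviation appear once all folds are combined, which is why the ProcessLog operator is placed after XValidation rather than inside it.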
    Greetings,
      Sebastian