prediction confidence

nav · May 2011

Hello, can someone explain how the prediction confidence columns work and how they are calculated when I apply a classification model to a test set? Thanks.

IngoRM · May 2011

Hi,

whoa, that's a pretty broad question with no definite answer for all learning schemes.

In general, the prediction confidences state how sure the model is about each of the possible class values. This is similar to probabilities ("how large is the probability that the class is 'positive'?") but not necessarily the same.

How are they calculated? Well, that differs between model types. For schemes like Naive Bayes and Logistic Regression, the confidences are indeed probabilities estimated from the training data. If you use an SVM and apply a scaling such as Platt scaling, they are at least pretty close to probabilities. For other schemes, things might be different. For example, the confidence a decision tree gives for a class is the fraction of cases of that class in the applicable leaf, relative to the total number of cases in this leaf.
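
The decision tree case can be sketched in a few lines of Python. This is only an illustration of the "fraction of the class in the leaf" idea described above, not RapidMiner's implementation; the function name `leaf_confidences` and the example leaf are hypothetical.

```python
from collections import Counter


def leaf_confidences(leaf_labels):
    """Return {class: fraction of the leaf's training cases with that class}."""
    if not leaf_labels:
        raise ValueError("a leaf needs at least one training case")
    counts = Counter(leaf_labels)
    total = len(leaf_labels)
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical leaf that received 8 training cases: 6 "positive", 2 "negative".
leaf = ["positive"] * 6 + ["negative"] * 2
conf = leaf_confidences(leaf)
print(conf)  # {'positive': 0.75, 'negative': 0.25}
```

Every test example that ends up in this leaf would get the same pair of confidences, which is one reason tree confidences tend to take only a few distinct values.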

There are only two ways: Simply accept the confidences as a measurement of how sure the model is and believe it. Or do it the hard way: read all the literature about all the model types and learn how they are calculated in detail. The source code might also help here.

Cheers,
Ingo

adaman · May 2011

Hi all,

this is all fine for me, and I don't need to understand it all in detail. But I would like to put a threshold on the confidence after the model applier has finished, so that I keep only the examples above or below that threshold. I can't, though. Is that because the confidence is a special attribute, or am I doing something completely wrong?

IngoRM · June 2011

Hi,

well, you have several options for this.

You could use

the operator "Generate Attributes" (you will have to rename the confidence attributes first, since the parentheses in their names would otherwise cause problems...)
one of the discretization operators
the operator "Drop Uncertain Predictions" (although this one does not exactly divide your data into discrete bins...)

If the fact that the confidence is a special attribute is a problem somewhere, you could either check the setting "include special attributes" or use the operator "Set Role" before the data transformation is applied.

Here is an example using the operator "Generate Attributes":


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
    <process expanded="true" height="359" width="815">
      <operator activated="true" class="generate_data" compatibility="5.1.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
        <parameter key="target_function" value="sum classification"/>
        <parameter key="number_examples" value="500"/>
      </operator>
      <operator activated="true" class="add_noise" compatibility="5.1.008" expanded="true" height="94" name="Add Noise" width="90" x="179" y="30">
        <parameter key="random_attributes" value="1"/>
        <list key="noise"/>
      </operator>
      <operator activated="true" class="naive_bayes" compatibility="5.1.008" expanded="true" height="76" name="Naive Bayes" width="90" x="313" y="30"/>
      <operator activated="true" class="generate_data" compatibility="5.1.008" expanded="true" height="60" name="Generate Data (2)" width="90" x="179" y="165">
        <parameter key="target_function" value="sum classification"/>
        <parameter key="number_examples" value="200"/>
      </operator>
      <operator activated="true" class="add_noise" compatibility="5.1.008" expanded="true" height="94" name="Add Noise (2)" width="90" x="313" y="165">
        <parameter key="random_attributes" value="1"/>
        <list key="noise"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="447" y="30">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="rename_by_replacing" compatibility="5.1.008" expanded="true" height="76" name="Rename by Replacing" width="90" x="581" y="30">
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="replace_by" value="_"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="5.1.008" expanded="true" height="76" name="Generate Attributes" width="90" x="715" y="30">
        <list key="function_descriptions">
          <parameter key="discretized" value="if (confidence_negative_&gt;0.8,&quot;high&quot;,&quot;low&quot;)"/>
        </list>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Add Noise" to_port="example set input"/>
      <connect from_op="Add Noise" from_port="example set output" to_op="Naive Bayes" to_port="training set"/>
      <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Generate Data (2)" from_port="output" to_op="Add Noise (2)" to_port="example set input"/>
      <connect from_op="Add Noise (2)" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Rename by Replacing" to_port="example set input"/>
      <connect from_op="Rename by Replacing" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
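
For readers who do not want to import the process, the logic of the expression `if (confidence_negative_>0.8,"high","low")` can be sketched in plain Python. The rows below are made-up values, and `label_confidence` is a hypothetical helper, not part of RapidMiner.

```python
def label_confidence(confidence, threshold=0.8):
    """Mirror the expression above: 'high' strictly above the threshold, else 'low'."""
    return "high" if confidence > threshold else "low"


# Made-up scored examples; the attribute name matches the renamed confidence column.
rows = [
    {"confidence_negative_": 0.95},
    {"confidence_negative_": 0.80},
    {"confidence_negative_": 0.35},
]

for row in rows:
    row["discretized"] = label_confidence(row["confidence_negative_"])

# Keep only the examples whose confidence exceeds the threshold.
high_only = [row for row in rows if row["discretized"] == "high"]
print([row["discretized"] for row in rows])  # ['high', 'low', 'low']
```

Note that the comparison is strict, so an example with a confidence of exactly 0.8 is labeled "low".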

And here is an example using one of the discretization operators:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
    <process expanded="true" height="359" width="815">
      <operator activated="true" class="generate_data" compatibility="5.1.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
        <parameter key="target_function" value="sum classification"/>
        <parameter key="number_examples" value="500"/>
      </operator>
      <operator activated="true" class="add_noise" compatibility="5.1.008" expanded="true" height="94" name="Add Noise" width="90" x="179" y="30">
        <parameter key="random_attributes" value="1"/>
        <list key="noise"/>
      </operator>
      <operator activated="true" class="naive_bayes" compatibility="5.1.008" expanded="true" height="76" name="Naive Bayes" width="90" x="313" y="30"/>
      <operator activated="true" class="generate_data" compatibility="5.1.008" expanded="true" height="60" name="Generate Data (2)" width="90" x="179" y="165">
        <parameter key="target_function" value="sum classification"/>
        <parameter key="number_examples" value="200"/>
      </operator>
      <operator activated="true" class="add_noise" compatibility="5.1.008" expanded="true" height="94" name="Add Noise (2)" width="90" x="313" y="165">
        <parameter key="random_attributes" value="1"/>
        <list key="noise"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="447" y="30">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="discretize_by_frequency" compatibility="5.1.008" expanded="true" height="94" name="Discretize" width="90" x="581" y="30">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="confidence(negative)"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Add Noise" to_port="example set input"/>
      <connect from_op="Add Noise" from_port="example set output" to_op="Naive Bayes" to_port="training set"/>
      <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Generate Data (2)" from_port="output" to_op="Add Noise (2)" to_port="example set input"/>
      <connect from_op="Add Noise (2)" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Discretize" to_port="example set input"/>
      <connect from_op="Discretize" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
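
As a rough idea of what frequency-based discretization does to a confidence column, here is a minimal Python sketch of equal-frequency binning. It assumes two bins and makes no claim to match how the RapidMiner operator handles ties or names its ranges; `equal_frequency_bins` and the sample values are hypothetical.

```python
def equal_frequency_bins(values, number_of_bins=2):
    """Assign each value a bin index so that bins hold roughly equal numbers of values.

    Ties are broken by position, so identical values can land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * number_of_bins // len(values)
    return bins


# Made-up confidence(negative) values for six examples.
confidences = [0.91, 0.12, 0.55, 0.78, 0.33, 0.67]
bins = equal_frequency_bins(confidences)
print(bins)  # [1, 0, 0, 1, 0, 1]
```

Unlike a fixed threshold such as 0.8, the bin boundary here depends on the data: the three lowest confidences go to bin 0 and the three highest to bin 1, whatever their absolute values are.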

Cheers,
Ingo

adaman · June 2011

thx :-) for the hints

sinead_bracken · November 2017

Hi there,

just to add on this:

Are the values indicative of how sure we are? That is, if the confidence value is 0.785, could we say we are 78.5% confident that this prediction falls into this category?

Or is it more along the lines of 78.5% of entries like this fall into this category too?

Telcontar120 · November 2017

Another good, but sometimes difficult, question. Generally the confidence can be interpreted in both senses, because the first sense (confidence in the prediction) is actually based on the second sense (distribution of similar records). However, this number itself is highly dependent on the specifics of the training dataset, so it is susceptible to "drift" when applying the score to other datasets. Most of the time, the scores are more robust as rank-ordering tools: even if the underlying class distributions shift and the absolute probabilities change with them, the scores usually still preserve the correct ordering.
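
One way to see the rank-ordering point is a small Python sketch of a class prior shift. It assumes the scores are probabilities from a model trained on balanced classes (prior 0.5) that is then applied to data where the positive class is much rarer (prior 0.1). The odds-ratio correction used here is a standard textbook adjustment and nothing specific to RapidMiner; all names and numbers are hypothetical.

```python
def adjust_for_prior(p, old_prior, new_prior):
    """Rescale a probability p (0 < p < 1) from one class prior to another via odds."""
    odds = p / (1 - p)
    ratio = (new_prior / (1 - new_prior)) / (old_prior / (1 - old_prior))
    new_odds = odds * ratio
    return new_odds / (1 + new_odds)


def ranking(values):
    """Indices of the values, ordered from highest to lowest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


scores = [0.9, 0.6, 0.785, 0.2]
shifted = [adjust_for_prior(p, old_prior=0.5, new_prior=0.1) for p in scores]

print(ranking(scores) == ranking(shifted))  # True
```

Every adjusted value is lower than the original score, so a reading like "78.5% confident" no longer holds on the new data, yet the examples are still ranked in the same order.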

rafeena · January 2020

@sinead_bracken and everyone else: regarding this question, does it mean the confidence is like the accuracy measurement of the performance?

Telcontar120 · January 2020

No, I would say that accuracy is something different, at least as usually defined in a machine learning context. Accuracy is typically a measure of overall model performance, such as derived from the confusion matrix for a classification problem, and as shown in the performance operators in RapidMiner.
This is related but is ultimately not the same as the confidence for an individual prediction (or even set of predictions) and it is itself subject to skew based on the confidence threshold selected for classification purposes (see the earlier part of this same thread for a discussion of setting thresholds).
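
The difference can be illustrated with a short Python sketch: the confidences are per-example values, while accuracy is a single number for the whole scored set, and it changes with the threshold used to turn confidences into predictions. The data and the helper `accuracy_at` are made up for illustration.

```python
def accuracy_at(confidences, labels, threshold):
    """Share of correct predictions when predicting 'positive' at or above the threshold."""
    predictions = ["positive" if c >= threshold else "negative" for c in confidences]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up confidence(positive) values and true labels for five examples.
conf_positive = [0.95, 0.785, 0.60, 0.40, 0.10]
labels = ["positive", "positive", "negative", "positive", "negative"]

print(accuracy_at(conf_positive, labels, threshold=0.5))  # 0.6
print(accuracy_at(conf_positive, labels, threshold=0.7))  # 0.8
```

The same five confidences give an accuracy of 0.6 or 0.8 depending on the threshold, so neither number can be read off from any single confidence value.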
