RapidMiner

[SOLVED] Set (or Optimize) Class Recall for a Classification Model

Is it possible to "set" the recall for one class of a classification model and have RapidMiner yield the corresponding model and parameters?  For example, take a Decision Tree with two output labels, Y and N, where I would like the N class's recall to be at least 80%.  I'm thinking there has to be a way to do this.  Perhaps some type of loop with the MetaCost operator, or something better?

Ideally, I would optimize the tree's parameters subject to the criterion that the N class's recall is at least 80%.  Yes, I realize that increasing N's recall will cost recall on Y.

Any help or guidance is greatly appreciated.

Jason
2 REPLIES

Re: Set (or Optimize) Class Recall for a Classification Model

Hi,

the Select Recall operator is your friend :)

When you apply a classification model, it generates, in addition to the hard binary classification (true/false), so-called confidences, i.e. how "sure" the model is that an example actually belongs to the predicted class. In the two-class case, the model usually predicts a class when its confidence for that class is higher than 50%. By changing that threshold, you can influence the recall of the model: lowering the threshold for one class raises its recall at the expense of the other class.
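To make the effect of the threshold concrete, here is a minimal sketch in plain Python rather than RapidMiner. The confidences and labels are made-up illustration data, and `recall_for` is a hypothetical helper, not a RapidMiner function.

```python
# Made-up confidences for class N and the corresponding true labels.
conf_n = [0.92, 0.81, 0.64, 0.55, 0.47, 0.38, 0.30, 0.22, 0.15, 0.05]
truth = ["N", "N", "Y", "N", "N", "Y", "N", "Y", "Y", "Y"]


def recall_for(label, threshold):
    """Recall of `label` when N is predicted iff confidence(N) >= threshold."""
    predicted = ["N" if c >= threshold else "Y" for c in conf_n]
    hits = sum(1 for p, t in zip(predicted, truth) if p == t == label)
    total = sum(1 for t in truth if t == label)
    return hits / total


# Lowering the threshold for N raises N's recall and lowers Y's recall.
for threshold in (0.5, 0.4, 0.25):
    print(threshold, recall_for("N", threshold), recall_for("Y", threshold))
```

On this toy data, moving the threshold from 0.5 to 0.25 takes the recall of N from 0.6 to 1.0, while the recall of Y drops from 0.8 to 0.6.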

The Select Recall operator must be applied to the classified example set (i.e. after Apply Model). It generates a threshold, which must then be applied to the example set with Apply Threshold. The process below depicts the general approach.

Best regards,
Marius

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.006">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.006" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="5.3.006" expanded="true" height="60" name="Retrieve Sonar" width="90" x="45" y="30">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="naive_bayes" compatibility="5.3.006" expanded="true" height="76" name="Naive Bayes" width="90" x="179" y="30"/>
      <operator activated="true" class="apply_model" compatibility="5.3.006" expanded="true" height="76" name="Apply Model" width="90" x="313" y="30">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="select_recall" compatibility="5.3.006" expanded="true" height="76" name="Select Recall" width="90" x="447" y="30"/>
      <operator activated="true" class="apply_threshold" compatibility="5.3.006" expanded="true" height="76" name="Apply Threshold" width="90" x="581" y="30"/>
      <connect from_op="Retrieve Sonar" from_port="output" to_op="Naive Bayes" to_port="training set"/>
      <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Naive Bayes" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Select Recall" to_port="example set"/>
      <connect from_op="Select Recall" from_port="example set" to_op="Apply Threshold" to_port="example set"/>
      <connect from_op="Select Recall" from_port="threshold" to_op="Apply Threshold" to_port="threshold"/>
      <connect from_op="Apply Threshold" from_port="example set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
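
For readers who want to see the idea behind the last two operators outside RapidMiner, the following plain Python sketch picks the highest threshold that still reaches a required recall and then applies it. It is only an approximation of the concept under stated assumptions: the data is made up, `select_threshold` and `apply_threshold` are hypothetical helper names, and RapidMiner's actual implementation may choose the threshold differently.

```python
def select_threshold(confidences, labels, positive, min_recall):
    """Return the highest threshold whose recall for `positive` is >= min_recall."""
    total = sum(1 for t in labels if t == positive)
    if total == 0:
        raise ValueError("no examples of the positive class")
    # Candidate thresholds are the observed confidences, highest first.
    for threshold in sorted(set(confidences), reverse=True):
        hits = sum(1 for c, t in zip(confidences, labels)
                   if c >= threshold and t == positive)
        if hits / total >= min_recall:
            return threshold
    raise ValueError("min_recall cannot be reached; it must be at most 1.0")


def apply_threshold(confidences, threshold, positive, negative):
    """Predict `positive` wherever its confidence reaches the threshold."""
    return [positive if c >= threshold else negative for c in confidences]


# Made-up confidences for class N and the corresponding true labels.
conf_n = [0.92, 0.81, 0.64, 0.55, 0.47, 0.38, 0.30, 0.22, 0.15, 0.05]
truth = ["N", "N", "Y", "N", "N", "Y", "N", "Y", "Y", "Y"]

threshold = select_threshold(conf_n, truth, "N", min_recall=0.8)
predictions = apply_threshold(conf_n, threshold, "N", "Y")
print(threshold, predictions)
```

Choosing the highest qualifying threshold keeps as much recall on the other class as possible while still meeting the requirement for N. As in the process above, the threshold here is chosen on the same data it is applied to; for an honest estimate it would be chosen on held-out data.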

Re: Set (or Optimize) Class Recall for a Classification Model

Thank you Marius.  This is EXACTLY what I was looking for - never even saw the Select Recall operator.  Your help is, as always, greatly appreciated.