How to configure cost matrix for MetaCost operator

Tripartio · November 2020

Hello,

I am struggling with correctly setting up a cost matrix for the MetaCost operator. The documentation on it is quite sparse and even after reading many posts on this forum, I cannot find my answer. I also

Here is the cost matrix for the default tutorial process for the MetaCost operator (distinguishing mines from rocks in the Sonar dataset):

Class 1 is Rock; Class 2 is Mine.

That image refers to the Matlab cost matrix format (which I think is here: https://www.mathworks.com/help/stats/classification-with-unequal-misclassification-costs.html), but I still have many questions:

I assume that the 2.0 and 3.0 are costs (penalties) for misclassification, since they are for wrong predictions. The Matlab instructions say that the true positive (TP) and true negative (TN) diagonal is supposed to be left at 0, but this does not make sense to me if I have benefits. Would they not be negative (opposite of costs) in that case?

Here is my business scenario on an actual (but sample) dataset. A bank is trying to contact customers to offer a financial product. The cost of calling a customer is 5€. If a customer accepts the offer and purchases the product, the bank expects to receive revenues from each customer of 50€. So, the profit from a successful contact is 50€ - 5€ = 45€. The loss for calling a customer who declines is 5€. The bank has data from past customers and wants to create a model that can be used on new customers. The data is quite unbalanced; approximately 9% of customers said yes, and 91% said no. So, I would like to use MetaCost to indicate my priorities to the machine learner. How should I configure MetaCost in such a situation?

Here is what I would think:

That is, with "yes" as the positive class:

True positive: earns 45€, so cost is -45
True negative: we spend nothing and gain nothing, so cost is 0
False positive: we spent 5€ to call a customer but gained nothing, so cost is 5
False negative: we spent nothing, but missed the opportunity of receiving 45€ profit, so cost is 45
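
To make the proposed matrix concrete, here is a minimal Python sketch (not RapidMiner code) that applies those four costs to a confusion matrix and compares the expected cost of the two possible predictions. The function names and the example counts are invented for illustration. Picking the prediction with the lower expected cost is the general idea behind cost-sensitive relabeling; how the MetaCost operator applies it internally is not confirmed here.

```python
# Costs proposed in the post, in euros, with "yes" as the positive class.
COSTS = {"TP": -45.0, "TN": 0.0, "FP": 5.0, "FN": 45.0}


def total_cost(counts, costs=COSTS):
    """Sum the cost over the four confusion-matrix cells."""
    return sum(costs[cell] * counts[cell] for cell in costs)


def expected_cost(p_yes, predict_yes, costs=COSTS):
    """Expected cost of one prediction, given P(actual = yes)."""
    if predict_yes:
        return p_yes * costs["TP"] + (1 - p_yes) * costs["FP"]
    return p_yes * costs["FN"] + (1 - p_yes) * costs["TN"]


def cheaper_prediction(p_yes, costs=COSTS):
    """Return the prediction ('yes' or 'no') with the lower expected cost."""
    cost_yes = expected_cost(p_yes, True, costs)
    cost_no = expected_cost(p_yes, False, costs)
    return "yes" if cost_yes < cost_no else "no"


# Hypothetical counts, invented purely for illustration.
example_counts = {"TP": 50, "TN": 800, "FP": 110, "FN": 40}
print(total_cost(example_counts))   # 100.0 for these invented counts
print(cheaper_prediction(0.09))     # 'yes' under this matrix
```

Note that under this matrix a missed customer is counted once as a lost 45 (false negative) and a won customer once as a gained 45 (true positive), so the total cost is not the same number as the cash profit in euros.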

However, when I run my data with that cost matrix, my results are always unsatisfactory. I don't want to get into the details now (though I could if necessary), but when I calculated my total earnings in euro, it is always negative: I always end up losing money. Of course, this has to do with the difficulty of my data, so the learners rarely attain above 55% recall on the "yes" class, but still, I wonder if I am configuring the cost matrix correctly.

So, I would appreciate clear guidance on how to correctly configure the cost matrix.

Regards,

Chitu

MartinLiebig · November 2020

Hi @Tripartio ,

I would recommend reading my ebook on it: https://rapidminer.com/resource/profit-sensitive-scoring/

What I would do is compare your model to a naive baseline. How much do you gain over the default assumption of predicting "yes" for everybody, i.e. calling every customer?
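
A rough sketch of that comparison in Python, using the figures from the original post (45€ profit per accepted call, 5€ loss per declined call, about 9% "yes"). The function names and the model's true/false positive counts are hypothetical, chosen only to show the arithmetic.

```python
def profit_call_everyone(n_customers, yes_rate, profit_yes=45.0, loss_no=5.0):
    """Profit in euros if every customer is called (the naive baseline)."""
    n_yes = n_customers * yes_rate
    n_no = n_customers - n_yes
    return n_yes * profit_yes - n_no * loss_no


def profit_of_model(true_positives, false_positives,
                    profit_yes=45.0, loss_no=5.0):
    """Profit in euros if only the customers predicted 'yes' are called."""
    return true_positives * profit_yes - false_positives * loss_no


baseline = profit_call_everyone(1000, 0.09)   # roughly -500 euros
model = profit_of_model(50, 110)              # hypothetical model results
print(round(baseline, 2), model, round(model - baseline, 2))
```

With a 9% acceptance rate the call-everybody baseline itself loses money (about 0.50€ per customer), so the useful question is how far a model improves on that, not only whether its total is positive.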

Best,

Martin

Tripartio · November 2020

Hi @mschmitz,

Thanks for that fantastic white paper. It is very clear. On one hand, it does not answer my question of how to translate your clearly explained theory into the MetaCost operator in RapidMiner. But on the other hand, it gives the perfect opportunity to reframe my question more clearly. So, my question now is, how do I translate from this business matrix:

Image: https://us.v-cdn.net/6030995/uploads/editor/bg/vs3wk8ga5xm5.png

to this cost matrix in the MetaCost operator:

Image: https://i.snipboard.io/ReWvmI.jpg

One important side note: your whitepaper flipped the coordinates of the confusion matrix: instead of real in the columns and predictions in the rows (which is what RapidMiner and apparently most software use), it presents real in the rows and predictions in the columns. I recommend that you update your whitepaper to match the RapidMiner orientation, since the goal is obviously to get people to use RapidMiner!

Could you please fill in the exact values from your business matrix that should be entered into the MetaCost operator matrix? Here are my two major sources of confusion (other than the flipped orientation):

The documentation note "The cost matrix in Matlab single line format" is very confusing. When I looked up the Matlab documentation (https://www.mathworks.com/help/stats/classification-with-unequal-misclassification-costs.html), it explicitly says, "The diagonal elements C(i,i) of the cost matrix must be 0", which directly contradicts your very intuitive business matrix.
Your business matrix expresses gains or benefits as positive numbers and costs as negative numbers, but "cost matrix" implies that gains or benefits should be expressed negatively and costs positively. Which is it?

Thanks,

Chitu

MartinLiebig · November 2020

Hi,

Your business matrix expresses gains or benefits as positive numbers and costs as negative numbers, but "cost matrix" implies that gains or benefits should be expressed negatively and costs positively. Which is it?

The difference is just a minus sign and the direction of optimization. Costs are to be minimized, gains to be maximized. If you want to switch from gains to costs, you just flip the sign of every entry.

Good catch on the predicted vs actual flip. You additionally need to transpose the matrix (i.e. flip the two off-diagonal elements).
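
As a minimal sketch of those two steps (negate, then transpose), here is a small Python helper. It uses the numbers from the original bank example rather than the white paper's matrix, and it assumes, as stated earlier in this thread, that the gains matrix has actual classes in rows while the MetaCost matrix wants predictions in rows. The helper names are hypothetical, and the output string simply imitates the single-line style seen in the operator's cost_matrix parameter.

```python
def gains_to_costs(gains):
    """Negate every entry and transpose (swap rows and columns)."""
    n = len(gains)
    # 0.0 - x is used instead of -x so that a zero gain stays 0.0, not -0.0.
    return [[0.0 - gains[col][row] for col in range(n)] for row in range(n)]


def to_single_line(matrix):
    """Format a matrix as a single-line string: rows split by ';'."""
    rows = [" ".join(str(float(v)) for v in row) for row in matrix]
    return "[" + ";".join(rows) + "]"


# Gains in euros: rows = actual (yes, no), columns = predicted (yes, no).
gains = [[45.0, -45.0],
         [-5.0, 0.0]]

costs = gains_to_costs(gains)   # rows = predicted, columns = actual
print(to_single_line(costs))    # [-45.0 5.0;45.0 0.0]
```

Which class ends up as "class 1" in the operator depends on the class mapping of the label, so the row and column order should be checked against the data before the string is pasted in.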

Best,

Martin

Tripartio · November 2020

Hi @mschmitz ,

I've read everything that you've written in response, but I'm sorry, I'm still confused. Could you please help me by explicitly filling in the values I would use for each cell of the MetaCost cost matrix:

For simplicity, please use the numbers from the business matrix in your white paper rather than the example I gave in my original post.

Regards,

Chitu

MartinLiebig · November 2020

Hi,

I think the correct way of doing this is the following:

Be sure to set up the right class mapping up front (see the attached process).

~Martin

<?xml version="1.0" encoding="UTF-8"?><process version="9.8.000">
<context>
    <input/>
    <output/>
    <macros/>
</context>
<operator activated="true" class="process" compatibility="9.8.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.8.000" expanded="true" height="68" name="Retrieve prepped data" width="90" x="313" y="85">
        <parameter key="repository_entry" value="//Demo Project/Direct Marketing/data/prepped data"/>
      </operator>
      <operator activated="true" class="nominal_to_binominal" compatibility="9.8.000" expanded="true" height="103" name="Nominal to Binominal" width="90" x="447" y="85">
        <parameter key="return_preprocessing_model" value="false"/>
        <parameter key="create_view" value="false"/>
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Response"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="nominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="file_path"/>
        <parameter key="block_type" value="single_value"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="single_value"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="transform_binominal" value="false"/>
        <parameter key="use_underscore_in_name" value="false"/>
      </operator>
      <operator activated="true" class="remap_binominals" compatibility="9.8.000" expanded="true" height="82" name="Remap Binominals" width="90" x="581" y="85">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="Response"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="binominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="binominal"/>
        <parameter key="block_type" value="value_matrix_start"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="negative_value" value="No"/>
        <parameter key="positive_value" value="Yes"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="9.8.000" expanded="true" height="82" name="Set Role" width="90" x="715" y="85">
        <parameter key="attribute_name" value="Response"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="metacost" compatibility="9.8.000" expanded="true" height="82" name="MetaCost" width="90" x="849" y="85">
        <parameter key="cost_matrix" value="[-47.0 3.0;10.0 0.0]"/>
        <parameter key="use_subset_for_training" value="1.0"/>
        <parameter key="iterations" value="10"/>
        <parameter key="sampling_with_replacement" value="true"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
        <process expanded="true">
          <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="9.8.000" expanded="true" height="103" name="Decision Tree" width="90" x="313" y="34">
            <parameter key="criterion" value="gain_ratio"/>
            <parameter key="maximal_depth" value="10"/>
            <parameter key="apply_pruning" value="true"/>
            <parameter key="confidence" value="0.1"/>
            <parameter key="apply_prepruning" value="true"/>
            <parameter key="minimal_gain" value="0.01"/>
            <parameter key="minimal_leaf_size" value="2"/>
            <parameter key="minimal_size_for_split" value="4"/>
            <parameter key="number_of_prepruning_alternatives" value="3"/>
          </operator>
          <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
          <connect from_op="Decision Tree" from_port="model" to_port="model"/>
          <portSpacing port="source_training set" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Retrieve prepped data" from_port="output" to_op="Nominal to Binominal" to_port="example set input"/>
      <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Remap Binominals" to_port="example set input"/>
      <connect from_op="Remap Binominals" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="MetaCost" to_port="training set"/>
      <connect from_op="MetaCost" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
</operator>
</process>
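
To inspect the matrix that the process above actually passes to MetaCost, the following Python sketch reads the cost_matrix parameter from a trimmed copy of the operator XML and splits it into rows. The fragment is shortened by hand and the helper name is hypothetical. Which class each row and column index refers to depends on the class mapping set by Remap Binominals and is not asserted here.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed copy of the MetaCost operator from the process above.
FRAGMENT = """
<operator class="metacost" name="MetaCost">
  <parameter key="cost_matrix" value="[-47.0 3.0;10.0 0.0]"/>
  <parameter key="iterations" value="10"/>
</operator>
""".strip()


def parse_cost_matrix(text):
    """Turn a single-line matrix string like '[a b;c d]' into nested lists."""
    body = text.strip().strip("[]")
    return [[float(v) for v in row.split()] for row in body.split(";")]


root = ET.fromstring(FRAGMENT)
raw = root.find("parameter[@key='cost_matrix']").get("value")
matrix = parse_cost_matrix(raw)

for i, row in enumerate(matrix, start=1):
    print(f"row {i}: {row}")
```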
