RapidMiner

How to Optimize Meta-Cost Matrix

SOLVED
Contributor II

Hi, 

 

We have a classification process that tries to predict infrequent events (roughly 1 in 1,000) in a large dataset, and we are using the MetaCost operator to give more weight to performance on these events in order to minimize false negatives. Is there a method to optimize the cost matrix against class recall values, or does the user just need to iterate through cost matrix values until the results are acceptable?

 

thanks!

Elite III

Re: How to Optimize Meta-Cost Matrix

I am not sure I totally understand your question. Are you talking about MetaCost or Performance (Costs)? You might want to look at Performance (Costs), which also allows you to use a cost matrix. Whatever performance operator you put inside your cross-validation is what will be optimized (if it offers more than one criterion, select the main criterion). You can then put the entire cross-validation inside an optimization operator if you want to do a grid search across different parameters as well.
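
As a rough illustration of the grid-search idea outside RapidMiner, the Python sketch below tries a list of candidate false-negative costs, applies the minimum-expected-cost decision rule to each example's predicted probability, and keeps the smallest cost that reaches a target recall on the rare class. The function names, toy probabilities, and candidate costs are all made up for the example, and this is not a description of how the MetaCost operator works internally.

```python
def predict_min_cost(p_pos, cost_fn, cost_fp=1.0):
    """Return 1 when predicting the rare class has the lower expected cost."""
    # Expected cost of predicting negative: p_pos * cost_fn (risk of a miss).
    # Expected cost of predicting positive: (1 - p_pos) * cost_fp (risk of a false alarm).
    return 1 if p_pos * cost_fn > (1.0 - p_pos) * cost_fp else 0


def recall(labels, preds):
    """Fraction of true positives that were predicted positive."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    pos = sum(labels)
    return tp / pos if pos else 0.0


def smallest_cost_for_recall(labels, probs, candidates, target):
    """Grid search: smallest false-negative cost whose recall meets the target."""
    for cost_fn in sorted(candidates):
        preds = [predict_min_cost(p, cost_fn) for p in probs]
        if recall(labels, preds) >= target:
            return cost_fn
    return None  # no candidate reached the target


# Toy validation data (hypothetical): 1 = rare event, probs = model confidence for class 1.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.60, 0.20, 0.05, 0.30, 0.10, 0.04, 0.02, 0.01]
best = smallest_cost_for_recall(labels, probs, [1, 2, 5, 10, 50, 100], target=1.0)
print(best)
```

In practice the recall would come from cross-validation rather than a single fixed list, but the loop structure is the same.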

 

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts
Contributor II
Solution
Accepted by topic author michaelgloven
2 weeks ago

Re: How to Optimize Meta-Cost Matrix

Hi Brian,

 

I'm actually working with technical support on getting further documentation regarding this operator. Once I get more info I'll post back.

 

Mike

RMStaff

Re: How to Optimize Meta-Cost Matrix

Dear Mike,

 

I think MetaCost is not the only operator you want to focus on. Weights/sampling and threshold finding are of equal importance.


First of all, you need to define a performance measure that reflects your needs. It should be higher for the cases that are of higher value to you and lower for the others. This can be done on a class or example level.
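
One possible form of such a measure is an average misclassification cost, sketched below in Python. Note that it runs the other way round (lower is better), so it would be minimized rather than maximized. The cost values and the `total_cost` helper are assumptions chosen for illustration, not values recommended anywhere in this thread.

```python
def total_cost(labels, preds, cost_matrix):
    """Average misclassification cost, with cost_matrix[true][predicted]."""
    return sum(cost_matrix[y][p] for y, p in zip(labels, preds)) / len(labels)


# Hypothetical costs: a missed rare event (true 1, predicted 0) is
# 100 times as expensive as a false alarm (true 0, predicted 1).
COSTS = {0: {0: 0.0, 1: 1.0}, 1: {0: 100.0, 1: 0.0}}

labels = [0, 0, 0, 1, 1]
preds_a = [0, 0, 0, 0, 1]  # one missed event, no false alarms
preds_b = [1, 1, 0, 1, 1]  # two false alarms, no missed events

cost_a = total_cost(labels, preds_a, COSTS)
cost_b = total_cost(labels, preds_b, COSTS)
print(cost_a, cost_b)  # the second model is far cheaper under these costs
```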

 

Afterwards you train an algorithm. Most algorithms are biased towards the majority class to begin with. One way to overcome this is to up- or down-sample, or to use weights. My personal "quick fix" is the Weight by Stratification operator. It adds a weight attribute such that the sum of weights is the same for every class. I would set the sum of weights to roughly your number of examples.

If the learner you use supports weights, it will now balance both classes. You can of course scale the weights up further to push your learner in the direction you want.
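
The weighting described above can be sketched in a few lines of Python. This follows the description in this post (equal weight sum per class, total roughly equal to the number of examples); it is an assumption about the behaviour, not the operator's actual code, and `stratified_weights` is a made-up name.

```python
from collections import Counter


def stratified_weights(labels, total_weight=None):
    """Per-example weights such that every class has the same weight sum."""
    counts = Counter(labels)
    if total_weight is None:
        total_weight = len(labels)  # total weight roughly equal to the number of examples
    per_class = total_weight / len(counts)  # each class gets an equal share
    return [per_class / counts[y] for y in labels]


labels = [0] * 8 + [1] * 2  # 8 majority examples, 2 rare ones
weights = stratified_weights(labels)

sum_majority = sum(w for w, y in zip(weights, labels) if y == 0)
sum_rare = sum(w for w, y in zip(weights, labels) if y == 1)
print(weights[0], weights[-1], sum_majority, sum_rare)
```

Multiplying the rare-class weights by a further factor is then the "scaling" step mentioned above.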

Tightly connected to this is the option to change the threshold at which an example is assigned to class A. By default we assign the class with the maximum confidence. With a threshold you can set this by hand. That way you can do things like "only if confidence(fraud) > 0.9, call it fraud".

The concrete value of the threshold is then a meta-parameter of the model.
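
Treating the threshold as a meta-parameter could look roughly like the Python sketch below: sweep a few candidate thresholds over the model's confidences and compare precision and recall for each. The data and helper names are invented for the example.

```python
def apply_threshold(confidences, threshold):
    """Call an example positive only if its confidence exceeds the threshold."""
    return [1 if c > threshold else 0 for c in confidences]


def precision_recall(labels, preds):
    """Precision and recall for the positive class (0.0 when undefined)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def sweep(labels, confidences, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    return [
        (t, *precision_recall(labels, apply_threshold(confidences, t)))
        for t in thresholds
    ]


# Toy data (hypothetical): confidence for the positive class per example.
labels = [1, 1, 0, 0, 0, 0]
conf = [0.95, 0.40, 0.60, 0.30, 0.10, 0.05]
rows = sweep(labels, conf, [0.2, 0.5, 0.9])
for t, prec, rec in rows:
    print(t, prec, rec)
```

A low threshold favours recall on the rare class, while a high one such as 0.9 favours precision, so the choice depends on the performance measure defined earlier.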

 

Cheers,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
Contributor II

Re: How to Optimize Meta-Cost Matrix

good ideas, easy to implement. thanks!