
# RM 9.4 feedback (official release) : Costs/Benefits calculation

Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Dear all,

First, thank you for implementing the costs/benefits calculation in this new release. I think a lot of users (including me) were waiting for this feature.

Two months ago I asked several questions about the Costs/Benefits calculation in the thread below, and thanks to @IngoRM's answers, everything was clear:

https://community.rapidminer.com/discussion/55904/questions-on-rapidminer-9-4-beta-new-releases

But in this official release, I see that "Total Cost/Benefit (expected)" and the associated average were dropped. My first question is: why?

The "Total Cost/Benefit (expected)" and the associated average are replaced by:
- "Total for best option"
- "Gain"

My second question is: can you explain how these two numbers are calculated (despite my efforts, I was not able to reproduce them), and why they are more relevant than the "Total Cost/Benefit (expected)"?

Here is my attempt to reproduce these two numbers on the Titanic dataset, using the NB model with all AutoModel options left at their defaults:

Third question: in the new column called "cost", why is the cost not counted as negative when the prediction is wrong? (I am assuming the following cost matrix):

Regards,

Lionel

Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi Ingo,

Yes, your long and detailed explanation helps me a lot to understand these new Benefits/Costs concepts. #noblackboxes

Now you will think I am picky about the details, but I will quote the German philosopher Friedrich Nietzsche: "The Devil is in the details".
Here goes:
The 3 monetary indicators (Total Cost/Benefits, Total for Best Option, Gain) are calculated on the whole validation set (i.e., for the Titanic dataset, on 524 examples [1309 examples x 40%]):

But the displayed confusion matrix is NOT built on the whole validation set:

Here we can see that the number of examples used to build this confusion matrix (again for the Titanic dataset) is
219 + 135 + 7 + 14 = 375 examples, presumably due to the factor 5/7 introduced by the Performance Average (Robust) operator.
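
To make the mismatch concrete, here is a quick check of the arithmetic above in Python. It only uses the numbers quoted in this post. The 5/7 factor is my own guess at the explanation, not something confirmed, and the variable names are purely illustrative.

```python
# Numbers quoted in the post (Titanic dataset in AutoModel).
total_examples = 1309
validation_fraction = 0.40

# Size of the whole validation set: 1309 x 40% = 523.6, rounded to 524.
validation_size = round(total_examples * validation_fraction)

# The four cells of the displayed confusion matrix.
confusion_cells = [219, 135, 7, 14]
confusion_total = sum(confusion_cells)

# ASSUMPTION: the confusion matrix only covers 5/7 of the validation set.
expected_from_5_of_7 = validation_size * 5 / 7

print(validation_size)                  # validation examples
print(confusion_total)                  # examples in the confusion matrix
print(round(expected_from_5_of_7, 1))   # what a 5/7 factor would predict
```

The 5/7 guess predicts roughly 374 examples, which is close to the 375 in the displayed matrix but not identical, so the exact split may be rounded differently.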

My question is: for the sake of consistency, shouldn't the 3 monetary indicators be calculated from this displayed confusion matrix? In other words, the monetary indicators currently displayed do not correspond directly to the displayed confusion matrix...

Regards,

Lionel

Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
`You got me there`
So Friedrich Nietzsche was right...

More seriously, I agree with your point of view, Ingo, and once again, thanks for taking the time to answer me.

Regards,

Lionel
Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
This is a very interesting discussion. I haven't had a chance to dive into this new operator yet, but I have a couple of questions.
@IngoRM how is the new operator different from the existing Performance (Costs) operator? Or is it?
It appears that they require the same inputs (a class order and then a misclassification cost matrix). In this framework, are you still allowed to enter benefits as negative costs?
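
To be clear about what I mean by "benefits as negative costs", here is a minimal sketch in Python. This is not RapidMiner's implementation; the confusion counts and cost values are made up for illustration, and the row/column convention is my own assumption.

```python
# HYPOTHETICAL 2x2 example: all counts and costs are invented.
# Rows = true class, columns = predicted class.
confusion = [
    [50, 10],
    [5, 35],
]

# Positive entries are costs (wrong predictions),
# negative entries are benefits (correct predictions).
costs = [
    [-2.0, 4.0],
    [6.0, -3.0],
]

# Total cost = sum over all cells of (count x cost per example).
total_cost = sum(
    confusion[i][j] * costs[i][j]
    for i in range(2)
    for j in range(2)
)

n_examples = sum(sum(row) for row in confusion)
average_cost = total_cost / n_examples

print(total_cost)    # a negative total means a net benefit
print(average_cost)  # cost per example
```

With these made-up numbers the total comes out negative, i.e. the benefits of the correct predictions outweigh the costs of the errors.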

Brian T.
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts