RM 9.4 feedback (official release) : Costs/Benefits calculation

lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 781   Unicorn
Dear all,

First, thank you for implementing the costs/benefits calculation in this new release. I think a lot of users (including me) were waiting for this new feature.

Two months ago I asked several questions about the Costs/Benefits calculation in the thread below, and thanks to @IngoRM's answers, everything was clear:

https://community.rapidminer.com/discussion/55904/questions-on-rapidminer-9-4-beta-new-releases

But in this official release, I see that "Total Cost/Benefit (expected)" and the associated average have been dropped. My first question is: why?

"Total Cost/Benefit (expected)" and the associated average have been replaced by:
 - "Total for best option"
 - "Gain"

My second question is: can you explain how these two numbers are calculated (despite my efforts, I was not able to reproduce them), and why they are more relevant than "Total Cost/Benefit (expected)"?

Here is my attempt to reproduce these two numbers with the Titanic dataset, using all default options in Auto Model with the Naive Bayes (NB) model:




Third question: in the new column called "cost", why is the cost not counted as negative when the prediction is wrong? (I am assuming the following cost matrix.)

 






Thank you for reading,

Regards,

Lionel

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 781   Unicorn
    Hi Ingo,

    Yes, your long and detailed explanation helps me a lot in understanding these new Benefits/Costs concepts. #noblackboxes  :)
    Thank you for taking the time to answer my questions.

    Now you'll think I'm picky about the details, but I will quote the German philosopher Friedrich Nietzsche: "The Devil is in the details"  >:)
    Let me begin:
    The 3 money indicators (Total Cost/Benefits, Total for Best Option, Gain) are calculated on the whole validation set (i.e. for the Titanic dataset, on 524 examples [1309 examples x 40%]):



    But the displayed confusion matrix is NOT built on the whole validation set:



    Here we can see that the number of examples used to build this confusion matrix (again for the Titanic dataset) is
    219 + 135 + 7 + 14 = 375 examples, presumably due to the factor 5/7 introduced by the Performance Average (Robust) operator.
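
    The arithmetic behind these two counts can be checked with a few lines of Python. This is only a sketch: the dataset size, the 40% share, the four confusion-matrix cells and the 5/7 factor come from the post above, while the assumption that the validation set is cut into 7 near-equal subsets of which 5 are kept is just one reading that happens to match the numbers, not a confirmed description of the operator.

```python
import math

total_examples = 1309        # Titanic dataset size quoted in the post
validation_fraction = 0.40   # validation share quoted in the post

# Size of the whole validation set: 1309 x 40% = 523.6, rounded to 524
validation_size = round(total_examples * validation_fraction)

# Cells of the displayed confusion matrix, as quoted in the post
confusion_cells = [219, 135, 7, 14]
displayed_examples = sum(confusion_cells)

# Assumption (not confirmed by the thread): the validation set is split
# into 7 near-equal subsets and 5 of them feed the displayed matrix.
subset_size = math.ceil(validation_size / 7)
kept_examples = 5 * subset_size

print(validation_size, displayed_examples, kept_examples)
```

    Under that assumption both routes give the same count, which is why the displayed matrix covers fewer examples than the money indicators.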

    My question is: for the sake of consistency of the results, shouldn't the 3 money indicators be calculated from this displayed confusion matrix? In other words, the displayed money indicators currently don't correspond directly to the displayed confusion matrix...

    Thank you for your patience and for reading...

    Regards,

    Lionel



  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 781   Unicorn
    You got me there
    So Friedrich Nietzsche was right ..... >:)

    More seriously, I agree with your point of view, Ingo, and once again, thanks for taking the time to answer me.

    Regards,

    Lionel 
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,256   Unicorn
    This is a very interesting discussion.  I haven't had a chance to dive into this new operator yet, but I had a couple of questions.
    @IngoRM how is the new operator different from the existing Performance(Costs) operator?  Or is it?
    It appears that they require the same inputs (a class order and then a misclassification cost matrix). In this framework, are you still allowed to enter benefits as negative costs?
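
    For readers unfamiliar with the convention, here is a minimal sketch of what "entering benefits as negative costs" means in general. It is not RapidMiner's implementation: the function name `total_cost`, the row/column orientation and every number below are hypothetical and chosen only for illustration.

```python
def total_cost(confusion, costs):
    """Sum count x cost over every (predicted, true) cell."""
    return sum(
        confusion[i][j] * costs[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )

# Hypothetical counts: rows = predicted class, columns = true class
confusion = [
    [50, 10],
    [5, 35],
]

# Hypothetical cost matrix in the same layout.
# Correct predictions (diagonal) carry a benefit, entered as a negative cost;
# wrong predictions (off-diagonal) carry a positive cost.
costs = [
    [-1.0, 2.0],
    [3.0, -1.0],
]

total = total_cost(confusion, costs)
print(total)  # a negative total means a net benefit
```

    With these made-up numbers the benefits outweigh the misclassification costs, so the total comes out negative.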

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts