
"Cross Entropy error in calculation?"

Fred12 Member Posts: 344 Unicorn
edited June 2019 in Help

hi,

in the Performance (Classification) operator, cross entropy is defined as the sum of the logarithms of the confidences of the true label classes, divided by the number of examples. However, I only get the correct results if I compute this but divide by the number of examples + 1

;)

 

I know it's not a big thing, but I spent a lot of time wondering why I got wrong results according to that definition; dividing by the number of examples + 1 instead gives the right results:

[Screenshot: Unbenannt.JPG]

 

cross entropy:

-(log2(1)+log2(0.385)+log2(0.615))/3 = 0.692803777838436

but:

[Screenshot: Unbenannt.JPG]

 

I only get the same value as RapidMiner if I divide by 4 instead of 3.
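
For reference, here is a small Python sketch of both calculations, using the three confidences from the example above (just an illustration for reproducing the numbers, not RapidMiner's actual code):

import math

confidences = [1.0, 0.385, 0.615]  # confidence of the true label class for each test example
log_sum = sum(math.log2(c) for c in confidences)
n = len(confidences)

print(-log_sum / n)        # 0.6928... -> the documented definition (divide by 3)
print(-log_sum / (n + 1))  # 0.5196... -> the value the operator reports (divide by 4)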


Answers

  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Hi,

     

    I am not 100% sure, but isn't this just one of those cases where 1 is added to the denominator to avoid the special case of an empty set, which would otherwise end up as an infinite performance?

     

    Cheers,

    Ingo

  • Fred12 Member Posts: 344 Unicorn

    That doesn't sound logical to me. Why would there be an empty set? The test set needs to have at least one example (maybe more); otherwise it would not be possible to calculate any performance at all, so there will always be at least one example.

    e.g. if you have one example with confidence 0, it would just be log2(0) / 1

     

    how log2(0) is defined, that's the real question (but that case should be possible to handle, since there will always be some confidence of zero at some point)... but it doesn't depend on an empty set, I think (see the quick check below).
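
As a quick illustration of the log2(0) question: most numeric libraries treat log2(0) as either an error or negative infinity, so a confidence of exactly zero would make the sum diverge no matter whether the denominator is N or N + 1. The check below is plain Python/NumPy, not RapidMiner's implementation; implementations often clamp confidences to a small epsilon before taking the logarithm, but whether the operator does that here is not confirmed in this thread.

import math
import numpy as np

try:
    math.log2(0.0)
except ValueError as err:
    print("math.log2(0.0) raises:", err)   # "math domain error"

print("np.log2(0.0) =", np.log2(0.0))      # -inf (with a divide-by-zero RuntimeWarning)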
