"Cross Entropy error in calculation?"

Fred12 Member Posts: 344 Unicorn
edited June 2019 in Help


In the Performance (Classification) operator, cross entropy is defined as the sum of the logarithms of the confidences of the true label classes, divided by the number of examples. However, I only get the correct results if I do this but divide by the number of examples + 1.



I know it's not a big thing, but I spent a lot of time wondering why I got wrong results according to that definition, until I divided by the number of examples + 1 and got the right results:



cross entropy:

-(log2(1)+log2(0.385)+log2(0.615))/3 = 0.692803777838436

but I only get the same result as the operator if I divide by 4 instead of 3.
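The arithmetic above can be reproduced with a short Python sketch. The function name `cross_entropy` and its `extra` parameter are made up for illustration and are not RapidMiner's implementation; the confidences are the three values from the calculation above.

```python
import math

def cross_entropy(confidences, extra=0):
    """Negative sum of log2 confidences of the true class,
    divided by (number of examples + extra).
    `extra` is a hypothetical knob to compare both denominators."""
    total = sum(math.log2(c) for c in confidences)
    return -total / (len(confidences) + extra)

# Confidences of the true label for the three examples in the post
confs = [1.0, 0.385, 0.615]

by_n = cross_entropy(confs)                   # divide by 3
by_n_plus_1 = cross_entropy(confs, extra=1)   # divide by 4

print(round(by_n, 6))         # about 0.692804
print(round(by_n_plus_1, 6))  # about 0.519603
```

The two values differ only by the factor 3/4, so comparing the operator's output against both makes it easy to see which denominator was used.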



    IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder



    I am not 100% sure, but isn't this just one of those cases where 1 is added to the denominator to avoid the special case of the empty set, which would otherwise end up with an infinite performance?
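To illustrate the empty-set argument, here is a minimal Python sketch. It assumes the measure is computed as a plain sum divided by a count; the function `mean_neg_log2` and its `smoothing` parameter are hypothetical and say nothing about how RapidMiner actually computes it.

```python
import math

def mean_neg_log2(confidences, smoothing=0):
    """Negative sum of log2 confidences over (count + smoothing).
    `smoothing` is a hypothetical parameter for this illustration."""
    total = sum(math.log2(c) for c in confidences)
    return -total / (len(confidences) + smoothing)

# With no examples, the plain definition divides by zero
try:
    mean_neg_log2([])
    empty_failed = False
except ZeroDivisionError:
    empty_failed = True

# Adding 1 to the denominator keeps the empty case well defined
smoothed_empty = mean_neg_log2([], smoothing=1)

print(empty_failed, smoothed_empty)
```

The price of that safety is visible on small sets: with 3 examples the result shrinks by a factor of 3/4.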




    Fred12 Member Posts: 344 Unicorn

    That doesn't sound logical to me. Why would there be an empty set? The test set needs to have at least 1 example (or maybe more); otherwise it would not be possible to calculate any performance, so there will always be at least 1 example.

    E.g. if you have 1 example with 0 confidence, it would just be log2(0)/1.


    How log2(0) is defined, that's the question (but that should be possible, since there will always be some confidences of zero). But I don't think it depends on an empty set.
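As a side note on the log2(0) question, the sketch below shows what happens in plain Python and one common workaround, clipping the confidence to a small positive value. The helper `safe_log2` and the threshold `eps` are assumptions for illustration only; whether RapidMiner does anything like this is not known from this thread.

```python
import math

def safe_log2(confidence, eps=1e-15):
    """log2 with the confidence clipped to at least `eps`.
    Both the helper and the value of `eps` are hypothetical."""
    return math.log2(max(confidence, eps))

# Python's math.log2 rejects 0 outright instead of returning -infinity
try:
    math.log2(0.0)
    raised = False
except ValueError:
    raised = True

# With clipping, a confidence of 0 gives a large but finite penalty
clipped = safe_log2(0.0)

print(raised, clipped)
```

Mathematically log2(x) goes to minus infinity as x approaches 0, so a single example with confidence 0 for the true class makes the unclipped cross entropy infinite regardless of the denominator.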
