# "leave_one_out_performance_problem"

**bojansimoski** (Member, Contributor I, 2 posts)

Hello guys,

I'm using X-Validation for my analysis and have a question about interpreting the results from the Performance operator. For the accuracy of the classifier I get something like: accuracy: 65.38% +/- 36.08%. My question is about the second value, 36.08%. What is it, and how is it computed? I should mention that I use the leave-one-out technique.

Many Thanks!!


## Answers

**Unicorn** (1,869 posts)

The first part of the displayed accuracy is the mean accuracy of all N models, and the second part is the standard deviation.

Best,

Marius
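To make the answer above concrete, here is a minimal Python sketch of how a "mean +/- standard deviation" figure can arise under leave-one-out, where each of the N models is tested on a single example and therefore scores either 0 or 1. The function name `leave_one_out_summary` and the 17-of-26 run are hypothetical, and the use of the population standard deviation (dividing by N) is an assumption; the thread does not confirm which form RapidMiner uses.

```python
from statistics import mean, pstdev


def leave_one_out_summary(fold_correct):
    """Return (mean, std) of per-fold accuracies, each 1.0 (hit) or 0.0 (miss)."""
    scores = [1.0 if correct else 0.0 for correct in fold_correct]
    # pstdev is the population standard deviation (divides by N); an assumption here.
    return mean(scores), pstdev(scores)


# Hypothetical run: 26 examples, 17 of them classified correctly.
folds = [True] * 17 + [False] * 9
acc, spread = leave_one_out_summary(folds)
print(f"accuracy: {acc:.2%} +/- {spread:.2%}")
```

With these made-up counts the sketch prints `accuracy: 65.38% +/- 47.57%`, so the second number describes the spread of the individual 0/1 fold results rather than the uncertainty of the mean.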

**Contributor II** (7 posts)

And the results are strange to interpret in that situation.

I got results such as

84.26 +/- 36.08 or 63.38 +/- 47.57

In both cases, if I assume that this standard deviation is computed as sqrt(p(1-p)), taking p = accuracy (p = 0.8426, for instance), I get the standard deviation value shown; in the example, sqrt(0.8426(1-0.8426)). But I don't think this is right, because accuracy does not follow a Bernoulli distribution. I think the value should be further divided by sqrt(N). So my question, like Bojan's, is: how is this standard deviation computed?

Thank you!

AMT
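The hypothesis in the post above can be checked numerically. The sketch below evaluates sqrt(p(1-p)) for the accuracies exactly as quoted and prints the result next to the quoted spread. The helper name `bernoulli_std` is hypothetical, and the figures are copied from the post without correction, so small mismatches may come from rounding or transcription.

```python
import math


def bernoulli_std(p):
    """Standard deviation of a single 0/1 outcome with success probability p."""
    return math.sqrt(p * (1.0 - p))


# Accuracy / spread pairs as quoted in the post above (in percent).
quoted = [(84.26, 36.08), (63.38, 47.57)]
for acc_pct, spread_pct in quoted:
    predicted_pct = 100.0 * bernoulli_std(acc_pct / 100.0)
    print(f"accuracy {acc_pct:.2f}: quoted {spread_pct:.2f}, formula {predicted_pct:.2f}")
```

For both pairs the formula lands within about one percentage point of the quoted spread, which is consistent with the hypothesis but does not prove it.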

**Unicorn** (1,869 posts)

Best,

Marius

**Contributor II** (7 posts)

But I do not think that is what was used here. With a single test example, the result is either correct or incorrect.

At the end of the n iterations you obtain a count variable with a binomial distribution, since each iteration is a Bernoulli trial.

What I was pointing out is that this standard deviation seems to be estimated with the formula for the standard deviation of a Bernoulli distribution, sqrt(p(1-p)), and I did not find this on the Wikipedia page you pointed to. So how is the standard deviation really estimated?

Another point is how to interpret results like the ones I showed, where the performance has such a large spread, even reaching beyond 100%.

**Unicorn** (1,869 posts)

You can transform p(1-p) = p - p^2, which is what the standard variance formula reduces to when the values are only 0 or 1; its square root is the standard deviation.

Best,

Marius
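The transformation mentioned above can be written out step by step. This is a sketch using notation that is not in the thread: $x_i \in \{0, 1\}$ is the outcome of the $i$-th leave-one-out test, $p = \frac{1}{N}\sum_{i=1}^{N} x_i$ is the accuracy, and the population form of the variance (dividing by $N$) is assumed.

$$
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - p)^2
         = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - p^2
         = \frac{1}{N}\sum_{i=1}^{N} x_i - p^2
         = p - p^2
         = p(1 - p)
$$

The third step uses $x_i^2 = x_i$, which holds only because every value is 0 or 1. Taking the square root gives $\sigma = \sqrt{p(1 - p)}$.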

**Contributor II** (7 posts)

But this is the point. I think that to compute the std (standard deviation) of the accuracy, you need to further divide by sqrt(n). What do you think?

Greetings

A.M. Tomé
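The division by sqrt(n) proposed above is the usual standard error of a mean. A minimal sketch, assuming the n leave-one-out outcomes can be treated as independent 0/1 trials (an approximation, since the models share most of their training data); the function name and the 17-of-26 counts are hypothetical, and nothing here is confirmed as RapidMiner's behaviour.

```python
import math


def accuracy_standard_error(n_correct, n_total):
    """Return (accuracy, per-example std, standard error of the accuracy)."""
    p = n_correct / n_total
    std = math.sqrt(p * (1.0 - p))      # spread of a single 0/1 outcome
    stderr = std / math.sqrt(n_total)   # spread of the mean over n_total outcomes
    return p, std, stderr


# Hypothetical leave-one-out run: 17 of 26 examples classified correctly.
p, std, stderr = accuracy_standard_error(17, 26)
print(f"accuracy {p:.2%}, std {std:.2%}, standard error {stderr:.2%}")
```

Under these assumptions the standard error is much smaller than the per-example standard deviation (roughly 9% versus 48% for the made-up counts), which is the distinction the post is drawing.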

**Unicorn** (1,869 posts)

With per-model accuracy values of only 0 or 1, the usefulness of this value is certainly questionable. The same applies to the +/- notation, since it's not the error of the accuracy.

We will discuss that here at Rapid-I. Thanks for your input!

Best,

~Marius

**Contributor II** (7 posts)

Any news about this comment?

AMT

**Unicorn** (1,869 posts)

Best regards,

Marius