"Confidence values"

ferandi Member Posts: 9 Contributor II
edited June 2019 in Help
Hi friends,

I'm using RapidMiner for text classification with the SVM (LibSVM), k-NN, and Naive Bayes algorithms. When I get the results for my test data, I'm not sure how each one calculates the confidence values of each instance for each class. Can anyone help me? I need this information for my article.

Thanks in advance.

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    this is different for each of those algorithms:

    naive bayes: the confidence is directly the calculated probability delivered by the algorithm (actually, this is one of the rare cases where the confidence IS a real probability)
    k-nn: the confidence is the number of the k neighbors with the predicted class divided by k (the single values are weighted by distance in the case of weighted predictions)
    svm (I am not so sure about LibSVM, which uses a different calculation in the multiclass case): for binomial classes, a good estimate of the probability for the positive class, which is also what RapidMiner uses, is 1 / (1 + exp(-function_value)), where function_value is the SVM prediction
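
    To make the three rules concrete, here is a minimal sketch in Python (not RapidMiner code; the class scores, neighbor labels, and function_value below are made-up inputs):

    import math

    # Naive Bayes: the confidence is the class probability itself,
    # i.e. the per-class scores normalized to sum to 1.
    raw = {"yes": 0.094, "no": 0.082}
    total = sum(raw.values())
    nb_conf = {c: v / total for c, v in raw.items()}

    # k-NN (unweighted): fraction of the k nearest neighbors
    # that carry the class in question.
    neighbors = ["spam", "spam", "ham", "spam", "ham"]  # labels of the k=5 neighbors
    k = len(neighbors)
    knn_conf = {c: neighbors.count(c) / k for c in set(neighbors)}

    # SVM, binomial case: squash the decision-function value
    # through a sigmoid to estimate the positive-class probability.
    function_value = 1.7  # hypothetical SVM output for one example
    svm_conf_positive = 1.0 / (1.0 + math.exp(-function_value))

    print(nb_conf, knn_conf, svm_conf_positive)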

    Hope that helps,
    Ingo
  • ferandi Member Posts: 9 Contributor II
    Thank you very much Ingo!!!
  • ferandi Member Posts: 9 Contributor II
    Just one thing... what's the concept of confidence in text classification?


    Thanks
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    what's the concept of confidence in text classification?
    well, pretty much the same as for all other kinds of classification tasks: the confidence describes how certain a prediction is. Although similar to the probability of a prediction of a specific class, it is most often not the same (with the exception of some learners like Naive Bayes).

    The same applies to text classification: the confidence of a class value states how certain the model is that a document belongs to this class.

    Cheers,
    Ingo
  • ferandi Member Posts: 9 Contributor II
    Hi, thank you very much for your help!
    I need to clarify some aspects of my project:

    I'm using three different methods to classify approximately 3000 documents into 11 categories. The methods are k-NN, Naive Bayes, and SVM (LibSVM, linear kernel, C-SVC). After the documents are submitted for testing, each of the methods generates an output with a confidence (0-1) for the document in each category, and the chosen category is the one with the biggest confidence.
    What I'm doing is summing the confidences of the document for each category across the 3 models and choosing the label with the highest sum (roughly like the sketch below); I guess this is called bagging, right? Well, the fact is: my accuracy improved by about 2%. I'm still not sure how these confidence values are generated and normalized by RapidMiner in each model, and I need that to support my conclusions. Do I have to normalize the values of each method so they work together, or can I consider them already normalized, so that my result makes sense?
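
    For every document, it is roughly this (a sketch in Python with made-up confidence values; conf_knn, conf_nb, and conf_svm stand for the per-category confidences the three models produced for one document):

    # Per-document confidences from the three models, one value per category.
    conf_knn = {"cat01": 0.6, "cat02": 0.3, "cat03": 0.1}
    conf_nb  = {"cat01": 0.4, "cat02": 0.5, "cat03": 0.1}
    conf_svm = {"cat01": 0.5, "cat02": 0.2, "cat03": 0.3}

    # Sum the confidences per category and pick the category with the largest sum.
    summed = {c: conf_knn[c] + conf_nb[c] + conf_svm[c] for c in conf_knn}
    prediction = max(summed, key=summed.get)
    print(prediction, summed)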

    Many thanks in advance!
  • jing_ma Member Posts: 2 Contributor I

    Ingo, is there any documentation available to help understand each algorithm's definition of confidence? Thanks!

    Jing

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,055 RM Data Scientist

    Dear Jing,

    First of all: welcome to the community. There is no documentation on how our 250+ learners calculate confidences. Most of it can be found either in text books or in our code. Is there a specific operator we can help you with?

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • BenLie Member Posts: 1 Contributor I

    Here, just look at this sample, copied from the Help:


    Note that in the testing set, the attributes of the first example are Outlook = sunny and Wind = false. Naive Bayes does the calculation for all possible label values and selects the label value that has the maximum calculated probability.

    Calculation for label = yes

    Find the product of the following:

    prior probability of label = yes (i.e. 9/14)
    value from the distribution table when Outlook = sunny and label = yes (i.e. 0.223)
    value from the distribution table when Wind = false and label = yes (i.e. 0.659)
    Thus the answer = 9/14 * 0.223 * 0.659 = 0.094

    Calculation for label = no

    Find the product of the following:

    prior probability of label = no (i.e. 5/14)
    value from the distribution table when Outlook = sunny and label = no (i.e. 0.581)
    value from the distribution table when Wind = false and label = no (i.e. 0.397)
    Thus the answer = 5/14 * 0.581 * 0.397 = 0.082

    As the value for label = yes is the maximum over all possible label values, the label is predicted to be yes.


    And this is how the confidence is calculated:

    conf(yes) = 0.094/(0.094+0.082) = 0.534

    conf(no) = 0.082/(0.094+0.082) = 0.466

    Without round-off error you get:

    [Screenshot: Bayes.PNG, showing the resulting confidence values]
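
    A quick sanity check in Python, just redoing the arithmetic above without intermediate rounding (nothing RapidMiner-specific here):

    # Unnormalized Naive Bayes scores for the two label values.
    p_yes = 9 / 14 * 0.223 * 0.659  # score for label = yes
    p_no  = 5 / 14 * 0.581 * 0.397  # score for label = no

    # Confidences are the scores normalized to sum to 1.
    conf_yes = p_yes / (p_yes + p_no)
    conf_no  = p_no  / (p_yes + p_no)
    print(round(conf_yes, 3), round(conf_no, 3))  # -> 0.534 0.466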

     
