"Confidence values"

ferandi Member Posts: 9 Contributor II
edited June 2019 in Help
Hi friends,

I'm using RapidMiner for text classification with the SVM (LibSVM), k-NN, and Naive Bayes algorithms. When I get the results for my test data, I'm not sure how each algorithm calculates the confidence value of each instance for each class. Can anyone help me? I need this information for my article.

Thanks in advance.

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    this is different for each of those algorithms:

    Naive Bayes: the confidence is directly the probability calculated by the algorithm (actually, this is one of the rare cases where the confidence IS a real probability)
    k-NN: the confidence is the number of the k neighbors with the predicted class divided by k (the single values are weighted by distance in case of weighted predictions)
    SVM (I am not so sure about LibSVM, which uses another calculation in the multiclass case): for binomial classes, a good estimate of the probability of the positive class, which is also used by RapidMiner, is 1 / (1 + exp(-function_value)), where function_value is the SVM prediction
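
The k-NN and binomial SVM rules above can be sketched in a few lines of Python. This is only an illustration of the formulas as described in this post, not RapidMiner's actual code: the function names are made up, and the exact distance weighting RapidMiner applies is not specified here, so the weights are simply passed in and normalized by their sum (an assumption).

```python
import math

def knn_confidences(neighbor_labels, weights=None):
    """Confidence per class = (weighted) share of the k neighbors with that class.

    With weights=None every neighbor counts as 1, so this is count / k.
    The weighting scheme itself is an assumption; pass in whatever
    per-neighbor weights you want to illustrate.
    """
    if weights is None:
        weights = [1.0] * len(neighbor_labels)
    totals = {}
    for label, weight in zip(neighbor_labels, weights):
        totals[label] = totals.get(label, 0.0) + weight
    norm = sum(totals.values())
    return {label: total / norm for label, total in totals.items()}

def svm_confidence(function_value):
    """Logistic squashing 1 / (1 + exp(-function_value)) for the positive class."""
    if function_value >= 0:
        return 1.0 / (1.0 + math.exp(-function_value))
    # Equivalent form that avoids overflow for large negative values.
    e = math.exp(function_value)
    return e / (1.0 + e)

# Hypothetical example: 5 neighbors, 3 of them labeled "spam".
print(knn_confidences(["spam", "spam", "ham", "spam", "ham"]))
# Hypothetical SVM function value of 0 lies on the decision boundary.
print(svm_confidence(0.0))
```

With unweighted neighbors the k-NN confidences always sum to 1 across classes, and the SVM confidence for the negative class is simply 1 minus the value returned here.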

    Hope that helps,
    Ingo
  • ferandi Member Posts: 9 Contributor II
    Thank you very much Ingo!!!
  • ferandi Member Posts: 9 Contributor II
    Just one thing... what's the concept of confidence in text classification?


    Thanks
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    "What's the concept of confidence in text classification?"
    Well, pretty much the same as for all other kinds of classification tasks. The confidence describes how certain a prediction is. Although it is similar to the probability of predicting a specific class, it is most often not the same (with the exception of some learners like Naive Bayes).

    The same applies to text classification: the confidence of a class value states how certain the model is that a document belongs to this class.

    Cheers,
    Ingo
  • ferandi Member Posts: 9 Contributor II
    Hi, thank you very much for your help!
    I need to clarify some aspects of my project:

    I'm using three different methods to classify approximately 3000 documents into 11 categories. The methods are k-NN, Naive Bayes, and SVM (LibSVM, linear kernel, C-SVC). After the documents are submitted for testing, each method outputs a confidence value (0-1) for each document and each category, and the chosen category is the one with the highest confidence.
    What I'm doing is summing the confidences of each document for each category across the 3 models and choosing the label with the highest total. I guess this is called bagging, right? The fact is that my accuracy improved by about 2%. I'm still not sure how these confidence values are generated and normalized by RapidMiner in each model, which I need to know to support my conclusions. Do I have to normalize the values of each method so they can work together, or can I consider them already normalized, so that my result makes sense?
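
For readers following along, the summing scheme described in this post can be sketched as below. This is a minimal illustration, assuming each model reports one confidence in the 0-1 range per category for a document; the function name, category names, and numbers are hypothetical and not taken from the project. Summing confidences from different learners like this is often described as soft voting.

```python
def combine_confidences(per_model_confidences):
    """Sum per-class confidences over all models and pick the top class.

    per_model_confidences: list of dicts mapping class label -> confidence,
    one dict per model, all for the same document.
    """
    totals = {}
    for confidences in per_model_confidences:
        for label, confidence in confidences.items():
            totals[label] = totals.get(label, 0.0) + confidence
    best_label = max(totals, key=totals.get)
    return best_label, totals

# Hypothetical confidences for one document from the three models.
knn = {"sports": 0.60, "politics": 0.40}
naive_bayes = {"sports": 0.30, "politics": 0.70}
svm = {"sports": 0.55, "politics": 0.45}

label, totals = combine_confidences([knn, naive_bayes, svm])
print(label, totals)
```

In this made-up example two of the three models prefer "sports", yet the summed confidences favor "politics", which shows why the scale of each model's confidences matters for the combined result.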

    Many thanks in advance!
  • jing_ma Member Posts: 2 Contributor I

    Ingo, is there any documentation available to help understand each algorithm's definition of confidence? Thanks!

    Jing

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Dear Jing,

     

    First of all: welcome to the community. There is no documentation on how our 250+ learners calculate confidence. Most of it can be found either in textbooks or in our code. Is there any operator in particular that we can help you with?

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • BenLie Member Posts: 1 Contributor I

    Here, just look at the sample.

    Copied from the Help:


    Note that in the testing set, the attributes of the first example are Outlook = sunny and Wind = false. Naive Bayes does calculation for all possible label values and selects the label value that has maximum calculated probability.

    Calculation for label = yes

    Find the product of the following:

    Prior probability of label = yes (i.e. 9/14)
    value from distribution table when Outlook = sunny and label = yes (i.e. 0.223)
    value from distribution table when Wind = false and label = yes (i.e. 0.659)
    Thus the answer = 9/14*0.223*0.659 = 0.094

    Calculation for label = no

    Find the product of the following:

    Prior probability of label = no (i.e. 5/14)
    value from distribution table when Outlook = sunny and label = no (i.e. 0.581)
    value from distribution table when Wind = false and label = no (i.e. 0.397)
    Thus the answer = 5/14*0.581*0.397 = 0.082

    As the value for label = yes is the maximum of all possible label values, label is predicted to be yes.


    And this is how the confidence is calculated:

     

    conf(yes) = 0.094/(0.094+0.082) = 0.534

    conf(no) = 0.082/(0.094+0.082) = 0.466
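
The same arithmetic can be reproduced with a short Python sketch. It only uses the numbers quoted from the Help above (the 9/14 and 5/14 class shares and the four distribution table values); the variable and function names are made up for illustration, and this is not RapidMiner's own implementation.

```python
# Class shares and distribution table values quoted from the Help example.
prior = {"yes": 9 / 14, "no": 5 / 14}
outlook_sunny = {"yes": 0.223, "no": 0.581}
wind_false = {"yes": 0.659, "no": 0.397}

def naive_bayes_confidences():
    """Multiply the factors per label, then normalize so the values sum to 1."""
    scores = {
        label: prior[label] * outlook_sunny[label] * wind_false[label]
        for label in prior
    }
    total = sum(scores.values())
    return scores, {label: score / total for label, score in scores.items()}

scores, confidences = naive_bayes_confidences()
for label in ("yes", "no"):
    print(label, round(scores[label], 3), round(confidences[label], 3))
```

Because the division by the sum is applied to the unrounded products, the results can differ slightly in the last digit from the hand calculation with rounded intermediate values.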

     

    Without round-off error you get the values shown in the attached image (Bayes.PNG).

     
