confidence values from w-logistic seem out of range

bobdobbs · July 2009

Hi,

Testing out some data with the w-logistic operator.

I trained it on about 20,000 examples with two input variables and a binomial label ("sick", "not_sick").

I then ran a test set of 1900 examples through the resulting model (about 130 of them are "sick").

The w-logistic model returns confidence estimates for the "sick" class that are at most .28.

I assumed that was the "probability" of the example being in the sick class.

What is odd is that of the 20 highest-scoring examples (scores from .233 to .254), 14 are labeled "sick". That is 70% of those examples, so it appears that the w-logistic model is picking class members with a 70% probability. If so, why am I seeing confidence scores of .233?

Can anyone shed some light on this apparent discrepancy?
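
To make the comparison concrete, here is a rough Python sketch of the check I am doing by hand: take the k highest-scoring examples, then compare their average confidence with the fraction that are actually "sick". The function name `top_k_calibration` and the sample values are made up for illustration; they are not my real data and not anything from RapidMiner or Weka.

```python
def top_k_calibration(scores, labels, k, positive="sick"):
    """Return (mean confidence, observed positive fraction) for the k highest scores."""
    # Pair each score with its label and rank from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    mean_confidence = sum(score for score, _ in top) / len(top)
    observed_rate = sum(1 for _, label in top if label == positive) / len(top)
    return mean_confidence, observed_rate


# Made-up illustration data, NOT the actual scores from the test set.
scores = [0.25, 0.24, 0.24, 0.23, 0.10, 0.05]
labels = ["sick", "sick", "not_sick", "sick", "not_sick", "not_sick"]

mean_conf, observed = top_k_calibration(scores, labels, k=4)
print(f"mean confidence {mean_conf:.3f}, observed sick rate {observed:.2f}")
```

If the confidences were well-calibrated probabilities, the two numbers should be roughly equal. In my test set they are not: confidences of about .23 to .25 against an observed rate of 14/20 = .70.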

land · July 2009

Hi,
the Weka learners are as much a black box for us as they are for you, so I cannot explain this behavior. If you replace it with our own Logistic Regression model, we can take a look at any strange behavior.

Greetings,
Sebastian

bobdobbs · July 2009

Sebastian,

Your suggestion made a HUGE difference. The RM Logistic Regression model is delivering results that look very consistent. Much more like we expected.

Thank You ;D
