Why does Naive Bayes return a confidence either 0 or 1 for every sample?
fstarsinic
Member Posts: 20 Contributor II
I'm just guessing but is this telling me that there is some attribute the algorithm is keying on and discarding everything else? Is there a way to take the results and look at the predictions + the other attributes together in a correlation matrix to see if that is the case? I can't picture that with NB. Seems more of an NN kinda thing or a tree thing.
Anyway, 0 and 1 only?... that can't be a good sign. What does that indicate?
Answers
If I understand correctly, you want to find the correlation between the predicted output and the regular attributes used in the model. If so, yes: connect the "exa" port of the Performance operator to a Correlation Matrix operator and select the "include special attribute" option in the Correlation Matrix operator.
Also, what do the performance metrics indicate? Is the model predicting with high accuracy?
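For readers who want to try the same check outside RapidMiner, here is a minimal Python sketch of the idea: correlate each regular attribute with the prediction column. The data and the names `attr_a`, `attr_b`, and `prediction` are made up for illustration; this is not the RapidMiner operator itself, only the same calculation by hand.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scored examples: two regular attributes plus the model's prediction.
rows = [
    {"attr_a": 0.10, "attr_b": 5.0, "prediction": 0},
    {"attr_a": 0.20, "attr_b": 3.0, "prediction": 0},
    {"attr_a": 0.15, "attr_b": 4.0, "prediction": 0},
    {"attr_a": 0.90, "attr_b": 4.0, "prediction": 1},
    {"attr_a": 0.80, "attr_b": 5.0, "prediction": 1},
    {"attr_a": 0.95, "attr_b": 3.0, "prediction": 1},
]

def correlation_with_prediction(rows, target="prediction"):
    """Correlate every non-target column with the target column."""
    columns = [c for c in rows[0] if c != target]
    target_values = [r[target] for r in rows]
    return {c: pearson([r[c] for r in rows], target_values) for c in columns}

correlations = correlation_with_prediction(rows)
# List attributes from most to least strongly correlated with the prediction.
for name, value in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

An attribute whose correlation with the prediction is close to +1 or -1 would be a candidate for the one the model is "keying on"; a value near 0 suggests little linear relationship.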
Do let us know if you need more info.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
This is what looks odd to me. There are only a few test samples here, but the result is always the same regardless of sample size: the confidence (for predicting 0 or 1) is always either 0% or 100%. It seems likely that something is wrong.
Yes, the confidence is always 0 or 1, with nothing in between. I checked the data; here's a sample of it.
The vertical axis above is the number of samples. The horizontal axis shows the distinct confidence values (only two).
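One well-known way Naive Bayes ends up with confidences of almost exactly 0 or 1 is the independence assumption itself: per-attribute evidence is multiplied together, so many attributes (especially correlated ones) that each lean slightly toward one class push the posterior to an extreme. Whether that is what is happening with this particular data set is not confirmed in the thread. The sketch below is a toy two-class Gaussian Naive Bayes written by hand, with assumed class means of 0 and 1, unit standard deviation, and equal priors, all chosen purely for illustration.

```python
import math

def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def posterior_class_1(features, mean_0=0.0, mean_1=1.0, std=1.0, prior_1=0.5):
    """Naive Bayes confidence for class 1, treating every feature as independent."""
    log_joint_0 = math.log(1 - prior_1) + sum(
        gaussian_log_pdf(x, mean_0, std) for x in features
    )
    log_joint_1 = math.log(prior_1) + sum(
        gaussian_log_pdf(x, mean_1, std) for x in features
    )
    # Normalise in log space so that many features do not cause underflow.
    top = max(log_joint_0, log_joint_1)
    p0 = math.exp(log_joint_0 - top)
    p1 = math.exp(log_joint_1 - top)
    return p1 / (p0 + p1)

# The same mildly class-1-leaning value (0.8) repeated across k features.
posteriors = {k: posterior_class_1([0.8] * k) for k in (1, 10, 100)}
for k, p in posteriors.items():
    print(f"{k:>3} features -> confidence(class 1) = {p:.4f}")
```

With one feature the confidence is only modestly above 0.5; with ten it is already about 0.95, and with a hundred it is indistinguishable from 1 at four decimal places. In this toy setup, a histogram of confidences over many samples would show only two bars, much like the one described above.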
Keep the no-free-lunch theorem in mind as well: we never know in advance which algorithm will fit our data, which is why we try multiple models.
Varun
You can see what effect this has, and how you could solve it, in these videos:
https://academy.rapidminer.com/learn/video/sampling-weighting-intro
https://academy.rapidminer.com/learn/video/sampling-weighting-demo
And for Naive Bayes:
https://academy.rapidminer.com/learn/video/naive-bayes-intro
https://academy.rapidminer.com/courses/nave-bayes-demo