# Naive Bayes Probabilities

Member Posts: 2 Contributor I
edited November 2019 in Help
Hi,

I am trying to use a Naive Bayes model to classify text data by sentiment. I trained a model and am applying it to test data. I want to know the probabilities the Naive Bayes model assigns to each word in the word vector. I have the distribution from the trained model for all the words in the training data, but I would like to see the exact probabilities assigned to each word in the test data for the two categories.
I need this because I would like to check the impact of Laplace correction on new words in my test data that are not present in the training word list. I am using "Binary Term Occurrences" during vector creation.
The funny thing is that when I have one test record, it is classified as negative. But when I add another record to the test data, the first record is now classified as positive! I don't understand why introducing another record should change the classification of the first record.

Is there any way to see the exact probabilities for each record in my test data, as calculated from the model? I am using RapidMiner 5.1.
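
To make the Laplace correction question concrete, here is a minimal sketch of textbook additive smoothing for one binary term attribute. It is not taken from RapidMiner, whose exact correction formula may differ; the class name, the `smoothed` helper, and all counts are hypothetical and chosen only for illustration.

```java
public class LaplaceSketch {

    /**
     * Smoothed estimate of P(value | class) for a nominal attribute.
     * count      = training examples of the class that have this value
     * classTotal = all training examples of the class
     * numValues  = number of distinct values of the attribute
     * alpha      = smoothing strength (0 = no correction, 1 = add-one)
     */
    public static double smoothed(int count, int classTotal, int numValues, double alpha) {
        return (count + alpha) / (classTotal + alpha * numValues);
    }

    public static void main(String[] args) {
        // Hypothetical counts: the word occurs in 0 of 40 positive documents.
        int numValues = 2; // binary term occurrence: present or absent
        double raw = smoothed(0, 40, numValues, 0.0);
        double corrected = smoothed(0, 40, numValues, 1.0);

        // Without correction the estimate is zero, so its log is -Infinity
        // and the class is ruled out no matter what the other words say.
        System.out.println("uncorrected P = " + raw + ", log = " + Math.log(raw));
        System.out.println("corrected   P = " + corrected + ", log = " + Math.log(corrected));
    }
}
```

With correction, a value never seen with a class gets a small non-zero probability instead of vetoing that class outright.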

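```java
// NOTE: the [j] (attribute) and [i] (class) array indices are assumed here;
// the posted snippet had lost them, most likely to forum markup.
for (Attribute attribute : exampleSet.getAttributes()) {
	double value = example.getValue(attribute);
	if (nominal[j]) {
		if (!Double.isNaN(value)) {
			// Known nominal value: add its stored term to every class score.
			int intValue = (int) value;
			for (int i = 0; i < numberOfClasses; i++) {
				if (intValue < distributionProperties[j][i].length) {
					probabilities[i] += distributionProperties[j][i][intValue];
				}
			}
		} else {
			// Missing value: use the last entry of the distribution.
			for (int i = 0; i < numberOfClasses; i++) {
				probabilities[i] += distributionProperties[j][i][distributionProperties[j][i].length - 1];
			}
		}
	} else {
		if (!Double.isNaN(value)) {
			// Numerical value: subtract the log factor and the squared standardized distance.
			for (int i = 0; i < numberOfClasses; i++) {
				double base = (value - distributionProperties[j][i][INDEX_MEAN]) / distributionProperties[j][i][INDEX_STANDARD_DEVIATION];
				probabilities[i] -= distributionProperties[j][i][INDEX_LOG_FACTOR] + 0.5 * base * base;
			}
		}
	}
	j++;
}
```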
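
The loop above appears to accumulate one running sum per class. Assuming those sums are log-scores (log prior plus log likelihood terms), the following minimal sketch shows one common way to turn them into per-record class probabilities that sum to 1. The class and method names are hypothetical and the input values are made up; this is not RapidMiner code.

```java
import java.util.Arrays;

public class LogScoreNormalizer {

    /**
     * Converts per-class log-scores into probabilities that sum to 1.
     * The maximum is subtracted before exponentiating so that very
     * negative scores do not all underflow to zero.
     */
    public static double[] normalize(double[] logScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            max = Math.max(max, s);
        }
        double[] probs = new double[logScores.length];
        double sum = 0.0;
        for (int i = 0; i < logScores.length; i++) {
            probs[i] = Math.exp(logScores[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Hypothetical record: equal priors, likelihood 0.2 under class 0 and 0.6 under class 1.
        double[] logScores = {
            Math.log(0.5) + Math.log(0.2),
            Math.log(0.5) + Math.log(0.6)
        };
        System.out.println(Arrays.toString(normalize(logScores)));
    }
}
```

Printing these normalized values for each test record is one way to compare the class probabilities with and without Laplace correction.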