Handling missing values

petrovdenis · July 2011

Dear all,

I have created the following workflows:
1. with k-NN classifier.
2. with Naive Bayes classifier.
3. with Weka:W-J48 classifier.

For the J48 classifier there is a link to Ross Quinlan (1993), C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, San Mateo, CA.
That book describes how missing values are handled.

How do k-NN and Naive Bayes handle missing values?

dan_agape · July 2011

Hi,

When scoring/classifying an example, a Naive Bayes classifier simply leaves out of the calculation the attributes whose values are missing in that example, and this is probabilistically perfectly sound. That is, when evaluating the probability that an example belongs to a certain class given the evidence provided by the values of that example, the conditional probabilities for the attributes with missing values are simply omitted from Bayes' formula. If the same attributes have known values in other examples to be scored, they are used when scoring those examples. Likewise, when building the classifier, missing values are not counted when the algorithm estimates the sample means and sample standard deviations of those attributes for each of the classes.
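
A minimal sketch of this idea in Python, assuming numeric attributes modelled as Gaussians, None as the marker for a missing value, and at least one known value per attribute and class. The function names are made up for illustration, and this is not RapidMiner's implementation.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-attribute mean/std, ignoring None values."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        stats = []
        for j in range(len(rows[0])):
            # Only known values contribute to the estimates for attribute j.
            known = [r[j] for r in members if r[j] is not None]
            mean = sum(known) / len(known)
            var = sum((v - mean) ** 2 for v in known) / max(len(known) - 1, 1)
            stats.append((mean, math.sqrt(var) or 1e-9))  # avoid a zero std
        model[label] = (len(members) / len(rows), stats)
    return model

def log_gaussian(x, mean, std):
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    scores = {}
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, std) in zip(row, stats):
            if x is not None:  # missing attribute: its factor is omitted
                score += log_gaussian(x, mean, std)
        scores[label] = score
    return max(scores, key=scores.get)

rows = [[1.0, 10.0], [1.2, None], [0.8, 11.0],
        [5.0, 20.0], [5.2, 21.0], [None, 19.0]]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(rows, labels)
print(predict(model, [1.1, None]))   # scored using the first attribute only
print(predict(model, [None, 20.5]))  # scored using the second attribute only
```

Working in log space keeps the product of many small probabilities from underflowing; skipping a factor then just means not adding its term.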

Regarding k-NN, conventionally this algorithm calculates distances between examples as if all their values were known. When there are missing values, one basically needs to get rid of them before these distances are computed. Different implementations of k-NN may include different simple preprocessing techniques for this, for instance replacing missing values with averages/modes, discarding attributes with too many missing values from the analysis, discarding examples with (too many) missing values, etc. I guess RM's k-NN may replace missing values with averages/modes before the distances are calculated.
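
To illustrate the average-replacement variant, here is a small Python sketch. It assumes numeric attributes, None for a missing value, and Euclidean distance. The helper names are hypothetical, and whether RM's k-NN does anything like this is only the guess made above.

```python
import math
from collections import Counter

def column_means(rows):
    """Mean of each attribute, computed over the known values only."""
    means = []
    for j in range(len(rows[0])):
        known = [r[j] for r in rows if r[j] is not None]
        means.append(sum(known) / len(known))
    return means

def fill_missing(row, means):
    """Replace each None with the corresponding attribute mean."""
    return [m if x is None else x for x, m in zip(row, means)]

def knn_predict(train_rows, train_labels, query, k=3):
    # Means come from the training data and are reused for the query.
    means = column_means(train_rows)
    filled = [fill_missing(r, means) for r in train_rows]
    q = fill_missing(query, means)
    nearest = sorted(zip(filled, train_labels),
                     key=lambda pair: math.dist(pair[0], q))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [[1.0, 1.0], [1.2, None], [0.9, 1.1],
         [5.0, 5.0], [5.1, 4.9], [None, 5.2]]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train, labels, [5.0, None]))
print(knn_predict(train, labels, [1.0, 1.0]))
```

Note that a filled-in mean pulls an example toward the centre of the data on that attribute, so examples with many missing values end up looking more alike than they really are.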

Other, less trivial techniques for dealing with missing values, usually applied as separate preprocessing operations, include imputation, in which a supervised learning technique is used to find a suitable value wherever a value is missing. A simple illustration of this technique: for an example with a missing value for attribute A, find the most similar example in which A is known and take its value of A. See the Impute Missing Values operator in RM.
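
The "most similar example" illustration could look roughly like this in Python. This is a sketch under my own assumptions (numeric attributes, None for missing, similarity measured as Euclidean distance over the attributes known in both examples); the function names are invented and this is not how the Impute Missing Values operator is implemented.

```python
import math

def distance_on_shared(a, b):
    """Euclidean distance over attributes known in both rows; inf if none."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))

def impute_from_nearest(rows):
    """Return a copy of rows where each missing value is taken from the
    most similar row that has a known value for that attribute."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where attribute j is known.
            donors = [r for k, r in enumerate(rows) if k != i and r[j] is not None]
            if donors:
                donor = min(donors, key=lambda r: distance_on_shared(row, r))
                imputed[i][j] = donor[j]
    return imputed

data = [[1.0, 2.0, 3.0],
        [1.1, None, 3.1],
        [9.0, 8.0, 7.0]]
print(impute_from_nearest(data))
```

Similarity is always measured on the original rows, so an imputed value never influences another imputation. A more careful version would also normalise the attributes and account for how many attributes two rows actually share.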

Dan

petrovdenis · July 2011

Hi Dan,

Thank you for your response.

But I wanted to find out which algorithm is used for the pre-processing. I think I should take a look at the implementation of k-NN.

For now I use the "Replace Missing Values" operator before the classification operator (it offers the options you described). Thank you for the advice to take a look at the Impute Missing Values operator.
