04-08-2017 09:21 AM - edited 04-08-2017 09:22 AM
Hi! I'm new to rapidminer and I'm trying to run classification models (kNN, Decision Tree, and Naive Naive Bayes). Since some of the values in my dataset are contiunuous I applied descritization technique. If I will run the processed data on kNN model will it work without a problem? (Since the values are already in ranges and not the actual data and kNN needs the real values to compute the distance). Thanks!
04-08-2017 10:54 AM
k-NN will technically work with discretized values, although it will probably be much less effective because it will use a nominal distance measure, which is basically just 1 if two observations are in different categories (no matter how far apart they would be in the original numerical scale). You would be much better off to go back to the non-discretized values of the original numerical data for k-NN. But don't forget to normalize those numerical values to ensure that differences in attribute scaling do not affect your results.