Discretization

iamevanthegreat Member Posts: 1 Contributor I
edited November 2018 in Help

Hi! I'm new to RapidMiner and I'm trying to run classification models (k-NN, Decision Tree, and Naive Bayes). Since some of the values in my dataset are continuous, I applied a discretization technique. If I run the processed data through a k-NN model, will it work without a problem? (The values are now ranges rather than the actual data, and k-NN needs the real values to compute distances.) Thanks!

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    k-NN will technically work with discretized values, but it will probably be much less effective, because it will use a nominal distance measure. That measure is basically just 0 if two observations fall in the same category and 1 if they fall in different categories, no matter how far apart they would be on the original numerical scale. You would be much better off going back to the original, non-discretized numerical data for k-NN. But don't forget to normalize those numerical values so that differences in attribute scaling do not affect your results.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
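
To make the point in the answer above concrete, here is a minimal Python sketch (not RapidMiner's actual implementation) comparing a 0/1 nominal distance on binned values with a Euclidean distance on min-max normalized values. The data, the bin edges, and all function names are hypothetical, and summing the per-attribute mismatches is an assumption made for illustration.

```python
import math

def discretize(value, edges):
    """Return the index of the bin a value falls into (edges are upper bounds)."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)

def nominal_distance(a, b):
    """Sum of per-attribute mismatches: 0 if same bin, 1 if different (assumed form)."""
    return sum(0 if x == y else 1 for x, y in zip(a, b))

def min_max_normalize(rows):
    """Rescale every column to [0, 1] so no attribute dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def euclidean(a, b):
    """Ordinary Euclidean distance between two numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical examples: (age in years, income)
rows = [(25, 30000), (26, 31000), (60, 90000), (40, 55000)]
age_edges = [30, 50]            # hypothetical bins: <=30, 31-50, >50
income_edges = [40000, 70000]   # hypothetical bins: <=40k, 40k-70k, >70k

binned = [(discretize(a, age_edges), discretize(i, income_edges)) for a, i in rows]
normalized = min_max_normalize(rows)

# Distances from the first example to the third and fourth examples.
nominal_far = nominal_distance(binned[0], binned[2])
nominal_mid = nominal_distance(binned[0], binned[3])
numeric_far = euclidean(normalized[0], normalized[2])
numeric_mid = euclidean(normalized[0], normalized[3])

print("nominal:", nominal_far, nominal_mid)
print("numeric:", round(numeric_far, 3), round(numeric_mid, 3))
```

On the binned data, the 40-year-old and the 60-year-old look equally far from the 25-year-old, because each differs in both bins. On the normalized numeric data, the 40-year-old is clearly closer, which is the ordering information k-NN loses after discretization.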