Discretization

iamevanthegreat Member Posts: 1 Contributor I
edited November 2018 in Help

Hi! I'm new to RapidMiner and I'm trying to run classification models (k-NN, Decision Tree, and Naive Bayes). Since some of the values in my dataset are continuous, I applied a discretization technique. If I run the processed data through the k-NN model, will it work without a problem? (The values are now ranges rather than the actual data, and k-NN needs the real values to compute distances.) Thanks!

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    k-NN will technically run on discretized values, but it will probably be much less effective. With nominal attributes it uses a nominal distance measure, which is basically 0 if two observations fall in the same category and 1 if they fall in different categories, no matter how far apart they were on the original numerical scale. You would be much better off going back to the original, non-discretized numerical values for k-NN. But don't forget to normalize those numerical values so that differences in attribute scaling do not affect your results.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
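
To make the point in the answer above concrete, here is a minimal Python sketch (plain Python, not RapidMiner code). The sample ages, the bin edge at 30, the bin labels, and all helper names are made up for illustration, and the mismatch-count distance is only an assumed stand-in for whatever nominal measure the k-NN operator actually applies.

```python
import math

def discretize(value, edges, labels):
    """Map a number to a bin label (hypothetical helper, upper-exclusive edges)."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]

def nominal_distance(a, b):
    """Assumed nominal measure: count the attributes whose categories differ."""
    return sum(0 if x == y else 1 for x, y in zip(a, b))

def min_max_normalize(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

def euclidean(a, b):
    """Ordinary Euclidean distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative data: three observations with a single attribute (age).
ages = [29, 31, 69]

# After discretization, 29 and 31 land in different bins, 31 and 69 in the same one.
bins = [discretize(a, edges=[30], labels=["low", "high"]) for a in ages]
d_nominal_near = nominal_distance([bins[0]], [bins[1]])  # ages 29 vs 31
d_nominal_far = nominal_distance([bins[1]], [bins[2]])   # ages 31 vs 69

# On normalized numeric values, the ordering of the distances is preserved.
scaled = min_max_normalize(ages)
d_numeric_near = euclidean([scaled[0]], [scaled[1]])     # ages 29 vs 31
d_numeric_far = euclidean([scaled[1]], [scaled[2]])      # ages 31 vs 69

print(bins)
print(d_nominal_near, d_nominal_far)
print(d_numeric_near, d_numeric_far)
```

In this toy case the nominal measure treats 29 and 31 as farther apart than 31 and 69, while the normalized numeric distance keeps the original ordering. That is why running k-NN on the original, normalized numerical values tends to be the safer choice.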