
"The difference between using weighted vote and not using weighted vote learner"

amy Member Posts: 16 Maven
edited May 2019 in Help
I found that there is a k Nearest Neighbor learner in the group Learner.Supervised.Lazy.
It has a parameter named weighted vote. I am not sure what the difference is between k-NN with weighted vote and k-NN without it.
Could you tell me what the difference between them is? Where can I find some information on it?
It seems that there is a class named WeightedObject which holds the weight. But how is the weight calculated?
I'd be very grateful if you could give me a hint.
Thanks a million.

Regards

Amy

Answers

  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi Amy,

    Well, the answer is pretty easy. The parameter specifies whether the distances of the nearest neighbors should be taken into account in the voting decision during prediction. If it is disabled, every nearest neighbor has the same influence on the prediction. If it is enabled, neighbors with a lower distance to the example for which a prediction is made get a higher influence than those with a higher distance.
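
    For illustration, here is a minimal standalone sketch of the two voting modes. This is not the actual RapidMiner code; the class, method names and the tiny example data are made up, but the weighted variant mirrors the formula used in [tt]com.rapidminer.operator.learner.lazy.KNNClassificationModel[/tt].

    import java.util.Arrays;
    import java.util.List;

    public class VoteSketch {

        // Unweighted vote: each of the k neighbors contributes the same amount,
        // regardless of how far away it is.
        static double[] unweightedVote(List<double[]> neighbors, int numLabels) {
            double[] counter = new double[numLabels];
            for (double[] n : neighbors) {            // n = {distance, labelIndex}
                counter[(int) n[1]] += 1.0 / neighbors.size();
            }
            return counter;
        }

        // Weighted vote: closer neighbors (smaller distance) contribute more.
        static double[] weightedVote(List<double[]> neighbors, int numLabels) {
            int k = neighbors.size();
            double totalDistance = 0.0;
            for (double[] n : neighbors) {
                totalDistance += n[0];
            }
            double totalSimilarity;
            if (totalDistance == 0) {                 // all neighbors identical to the query
                totalDistance = 1;
                totalSimilarity = k;
            } else {
                totalSimilarity = Math.max(k - 1, 1);
            }
            double[] counter = new double[numLabels];
            for (double[] n : neighbors) {
                counter[(int) n[1]] += (1d - n[0] / totalDistance) / totalSimilarity;
            }
            return counter;
        }

        public static void main(String[] args) {
            // Three neighbors: distances 1, 2 and 7 with labels 0, 0 and 1.
            List<double[]> neighbors = List.of(
                    new double[] {1.0, 0}, new double[] {2.0, 0}, new double[] {7.0, 1});
            System.out.println(Arrays.toString(unweightedVote(neighbors, 2))); // roughly [0.67, 0.33]
            System.out.println(Arrays.toString(weightedVote(neighbors, 2)));   // roughly [0.85, 0.15]
        }
    }

    With the distances above (1, 2, 7), both variants predict label 0, but the weighted vote is more confident because the two closest neighbors carry most of the weight.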

    Regards,
    Tobias
  • amy Member Posts: 16 Maven
    Hi Tobias,
    Thank you so much for your kind reply. I have a better idea of it now.
    May I ask some further questions here?
    I found this topic: http://rapid-i.com/rapidforum/index.php/topic,249.0.html. It discusses how the weight is implemented.
    You talked about weighting by the distance, but what about similarity measures that are not distances, such as cosine similarity? How is the weight calculated then? What formula is used if the measure is not a distance but cosine similarity?

    Thanks a million.

    Amy
  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi Amy,

    of course you can ask questions. That is the intention of this forum ... ;)

    The weight is calculated in the following lines in the class [tt]com.rapidminer.operator.learner.lazy.KNNClassificationModel[/tt]:

    // finding next k neighbours and their distances
    Collection<Tupel<Double, Integer>> neighbours = samples.getNearestValueDistances(k, values);
    for (Tupel<Double, Integer> tupel : neighbours) {
        totalDistance += tupel.getFirst();
    }

    double totalSimilarity = 0.0d;
    if (totalDistance == 0) {
        totalDistance = 1;
        totalSimilarity = k;
    } else {
        totalSimilarity = Math.max(k - 1, 1);
    }

    // counting frequency of labels
    for (Tupel<Double, Integer> tupel : neighbours) {
        counter[tupel.getSecond()] += (1d - tupel.getFirst() / totalDistance) / totalSimilarity;
    }
    The weight calculation is pretty straightforward and should be easy to understand from the source code. In principle, the weighting scheme should also be the same for every distance/divergence measure.
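
    For illustration, here is a minimal standalone sketch (again not the actual RapidMiner code) of how the same scheme can be applied when one starts from cosine similarity rather than a distance: a common convention is to turn the similarity into a distance first, e.g. d = 1 - cosine similarity, and then reuse the formula above unchanged.

    import java.util.Arrays;
    import java.util.List;

    public class CosineVoteSketch {

        static double cosineSimilarity(double[] a, double[] b) {
            double dot = 0, normA = 0, normB = 0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }

        // Same weighting as in the snippet above, fed with
        // "cosine distances" (1 - similarity) instead of Euclidean distances.
        static double[] weightedVote(double[] query, List<double[]> neighbors, int[] labels, int numLabels) {
            int k = neighbors.size();
            double[] distances = new double[k];
            double totalDistance = 0.0;
            for (int i = 0; i < k; i++) {
                distances[i] = 1.0 - cosineSimilarity(query, neighbors.get(i));
                totalDistance += distances[i];
            }
            double totalSimilarity;
            if (totalDistance == 0) {
                totalDistance = 1;
                totalSimilarity = k;
            } else {
                totalSimilarity = Math.max(k - 1, 1);
            }
            double[] counter = new double[numLabels];
            for (int i = 0; i < k; i++) {
                counter[labels[i]] += (1d - distances[i] / totalDistance) / totalSimilarity;
            }
            return counter;
        }

        public static void main(String[] args) {
            double[] query = {1.0, 0.0};
            List<double[]> neighbors = List.of(
                    new double[] {0.9, 0.1},   // almost parallel to the query -> small cosine distance
                    new double[] {0.0, 1.0});  // orthogonal to the query -> large cosine distance
            int[] labels = {0, 1};
            // The nearly parallel neighbor dominates the vote for label 0.
            System.out.println(Arrays.toString(weightedVote(query, neighbors, labels, 2)));
        }
    }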

    Kind regards,
    Tobias