
"The difference between using weighted vote and not using weighted vote learner"

amy Member Posts: 16 Maven
edited May 2019 in Help
I found that there is a k Nearest Neighbor learner in the group Learner.Supervised.Lazy.
It has a parameter named weighted vote. I am not sure what the difference is between k-NN with weighted vote and k-NN without it.
Could you tell me what the difference between them is? Where can I find some information on it?
It seems that there is a class named WeightedObject which holds the weight. But how is the weight calculated?
I'd be very grateful if you could give me a hint.
Thanks a million.

Regards

Amy

Answers

  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi Amy,

    Well, the answer is pretty easy. The parameter specifies whether the distances of the nearest neighbors should be taken into account in the voting decision during prediction. If it is disabled, every nearest neighbor has the same influence on the prediction. If it is enabled, neighbors with a lower distance to the example for which a prediction is made get a higher influence than those with a higher distance.
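
    For illustration, here is a minimal standalone sketch of the two voting modes. This is not the actual RapidMiner code; the class, method names and the tiny example data are made up, but the weighted variant mirrors the formula used in [tt]com.rapidminer.operator.learner.lazy.KNNClassificationModel[/tt].

    import java.util.Arrays;
    import java.util.List;

    public class VoteSketch {

        // Unweighted vote: each of the k neighbors contributes the same amount,
        // regardless of how far away it is.
        static double[] unweightedVote(List<double[]> neighbors, int numLabels) {
            double[] counter = new double[numLabels];
            for (double[] n : neighbors) {            // n = {distance, labelIndex}
                counter[(int) n[1]] += 1.0 / neighbors.size();
            }
            return counter;
        }

        // Weighted vote: closer neighbors (smaller distance) contribute more.
        static double[] weightedVote(List<double[]> neighbors, int numLabels) {
            int k = neighbors.size();
            double totalDistance = 0.0;
            for (double[] n : neighbors) {
                totalDistance += n[0];
            }
            double totalSimilarity;
            if (totalDistance == 0) {                 // all neighbors identical to the query
                totalDistance = 1;
                totalSimilarity = k;
            } else {
                totalSimilarity = Math.max(k - 1, 1);
            }
            double[] counter = new double[numLabels];
            for (double[] n : neighbors) {
                counter[(int) n[1]] += (1d - n[0] / totalDistance) / totalSimilarity;
            }
            return counter;
        }

        public static void main(String[] args) {
            // Three neighbors: distances 1, 2 and 7 with labels 0, 0 and 1.
            List<double[]> neighbors = List.of(
                    new double[] {1.0, 0}, new double[] {2.0, 0}, new double[] {7.0, 1});
            System.out.println(Arrays.toString(unweightedVote(neighbors, 2))); // roughly [0.67, 0.33]
            System.out.println(Arrays.toString(weightedVote(neighbors, 2)));   // roughly [0.85, 0.15]
        }
    }

    With the distances above (1, 2, 7), both variants predict label 0, but the weighted vote is more confident because the two closest neighbors carry most of the weight.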

    Regards,
    Tobias
  • amy Member Posts: 16 Maven
    Hi Tobias,
    Thank you so much for your kind reply. I have a better idea of it now.
    May I ask some further questions here?
    I found this topic: http://rapid-i.com/rapidforum/index.php/topic,249.0.html. It discusses how the weight is implemented.
    You talked about weighting by the distance, but what about similarity measures that are not distances, such as cosine similarity? How is the weight calculated then? What formula is used if the measure is not a distance but cosine similarity?

    Thanks a million.

    Amy
  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi Amy,

    of course you can ask questions. That is the intention of this forum ... ;)

    The weight is calculated in the following lines in the class [tt]com.rapidminer.operator.learner.lazy.KNNClassificationModel[/tt]:

    // finding next k neighbours and their distances
    Collection<Tupel<Double, Integer>> neighbours = samples.getNearestValueDistances(k, values);
    for (Tupel<Double, Integer> tupel : neighbours) {
        totalDistance += tupel.getFirst();
    }

    double totalSimilarity = 0.0d;
    if (totalDistance == 0) {
        totalDistance = 1;
        totalSimilarity = k;
    } else {
        totalSimilarity = Math.max(k - 1, 1);
    }

    // counting frequency of labels
    for (Tupel<Double, Integer> tupel : neighbours) {
        counter[tupel.getSecond()] += (1d - tupel.getFirst() / totalDistance) / totalSimilarity;
    }
    The weight calculation is pretty straightforward and should be easy to understand from the source code. In principle, the weighting scheme should also be the same for every distance/divergence measure.
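
    For illustration, here is a minimal standalone sketch (again not the actual RapidMiner code) of how the same scheme can be applied when one starts from cosine similarity rather than a distance: a common convention is to turn the similarity into a distance first, e.g. d = 1 - cosine similarity, and then reuse the formula above unchanged.

    import java.util.Arrays;
    import java.util.List;

    public class CosineVoteSketch {

        static double cosineSimilarity(double[] a, double[] b) {
            double dot = 0, normA = 0, normB = 0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }

        // Same weighting as in the snippet above, fed with
        // "cosine distances" (1 - similarity) instead of Euclidean distances.
        static double[] weightedVote(double[] query, List<double[]> neighbors, int[] labels, int numLabels) {
            int k = neighbors.size();
            double[] distances = new double[k];
            double totalDistance = 0.0;
            for (int i = 0; i < k; i++) {
                distances[i] = 1.0 - cosineSimilarity(query, neighbors.get(i));
                totalDistance += distances[i];
            }
            double totalSimilarity;
            if (totalDistance == 0) {
                totalDistance = 1;
                totalSimilarity = k;
            } else {
                totalSimilarity = Math.max(k - 1, 1);
            }
            double[] counter = new double[numLabels];
            for (int i = 0; i < k; i++) {
                counter[labels[i]] += (1d - distances[i] / totalDistance) / totalSimilarity;
            }
            return counter;
        }

        public static void main(String[] args) {
            double[] query = {1.0, 0.0};
            List<double[]> neighbors = List.of(
                    new double[] {0.9, 0.1},   // almost parallel to the query -> small cosine distance
                    new double[] {0.0, 1.0});  // orthogonal to the query -> large cosine distance
            int[] labels = {0, 1};
            // The nearly parallel neighbor dominates the vote for label 0.
            System.out.println(Arrays.toString(weightedVote(query, neighbors, labels, 2)));
        }
    }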

    Kind regards,
    Tobias