I did a text classification with k-NN. I used 383 different texts for a sentiment analysis.
The k-NN should classify my data into 3 different labels. When I analyze my results in RapidMiner with a
it puts 122 texts in label 1, 240 in label 2 and 61 in label 3.
122 + 240 + 61 = 423
My question: How can the number of classified texts be bigger than the dataset (383)? Has anyone had the same problem, or a solution? Thanks for any help!
I don't know what your process is doing. If you did some text processing and created word vectors, then rotated and clustered them, you could very well end up with more rows than you started with.
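
One way to narrow it down, as a rough sketch: export the scored rows and check whether the same text shows up more than once. The Python below assumes you have an ID for each original text and one predicted label per row. The function name `check_counts` and the sample data are made up for illustration and are not part of RapidMiner.

```python
from collections import Counter

def check_counts(doc_ids, predicted_labels, expected_total):
    """Compare the number of classified rows with the original dataset size."""
    label_counts = Counter(predicted_labels)
    classified_total = sum(label_counts.values())
    # IDs that occur more than once point to rows duplicated somewhere upstream
    duplicated = {doc_id: n for doc_id, n in Counter(doc_ids).items() if n > 1}
    return {
        "label_counts": dict(label_counts),
        "classified_total": classified_total,
        "surplus": classified_total - expected_total,
        "duplicated_ids": duplicated,
    }

# Hypothetical data: 4 original texts, but text 2 appears twice after preprocessing
ids = [1, 2, 2, 3, 4]
labels = ["label 1", "label 2", "label 2", "label 3", "label 1"]
report = check_counts(ids, labels, expected_total=4)
print(report)
```

With the numbers from the question the surplus would be 423 - 383 = 40, so if duplication is the cause, the duplicated IDs should account for 40 extra rows.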