Text Clustering using K-Medoids Algorithm
I'm new to RapidMiner. I have 1000+ online reviews collected from Tripadvisor.com, and I want to apply the K-Medoids algorithm to cluster those reviews into k clusters. I chose K-Medoids because I want to find the "medoid" of each cluster, which I believe can represent the contents of the entire cluster. I have already applied some nodes, such as:
- Read Excel
- Select Attributes
- Nominal to Text
- Process Documents from Data (Tokenization, Stemming, Stopwords Removal)
- and the Clustering node itself
But I can't seem to get proportional clusters. With 1000+ reviews and k = 2, the ratio of members in clusters 1 and 2 is 99 : 1.
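To make the setup concrete, the pipeline above can be sketched outside RapidMiner roughly as follows. This is a minimal, standard-library Python sketch and not the RapidMiner implementation: the sample reviews, the tiny stopword list, the use of cosine distance on TF-IDF vectors, and the medoid initialisation are all assumptions made for illustration, and stemming is left out.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list and sample reviews, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "to", "of", "in", "it"}
reviews = [
    "The room was clean and the bed was comfortable",
    "Clean room, comfortable bed, quiet hotel",
    "Hotel room was spacious and clean",
    "The food was delicious and the restaurant staff were friendly",
    "Delicious food and friendly restaurant service",
    "Great restaurant, the food was tasty",
]

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords (no stemming here)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def tfidf_vectors(docs):
    """Return one L2-normalised sparse TF-IDF vector (a dict) per document."""
    tokens = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokens for term in set(toks))
    vectors = []
    for toks in tokens:
        vec = {t: c * (math.log(n / df[t]) + 1.0) for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine_distance(u, v):
    """1 - cosine similarity; the vectors are already unit length."""
    return 1.0 - sum(w * v.get(t, 0.0) for t, w in u.items())

def k_medoids(vectors, k, max_iter=100):
    """Alternating k-medoids: assign to nearest medoid, then re-pick each medoid."""
    n = len(vectors)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

    def assign(meds):
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    # Assumed deterministic init: most central document, then farthest-first.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # The new medoid minimises total distance to its cluster members.
            best = min(members, key=lambda m: sum(dist[m][j] for j in members)) if members else medoids[c]
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)

medoids, labels = k_medoids(tfidf_vectors(reviews), k=2)
sizes = Counter(labels)
for c, m in enumerate(medoids):
    print(f"cluster {c}: {sizes[c]} reviews, medoid = {reviews[m]!r}")
```

Printing the cluster sizes next to the medoid texts like this makes it easy to see whether one cluster is absorbing nearly every review, as in the 99 : 1 split described above.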
Please help me!