
preparation of data for kmeans

jose Member Posts: 16 Contributor II
Hello, my question is this:
I want to use k-means to group short texts, such as the following:
ugly cat
cute dog
cat intestine
barking dog
cow gives milk
cow is in the valley
chicken eats corn
I want to group the data by animal. Can I do this?
I only have the text. What steps do I have to follow to use k-means, and how do I prepare the data?
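The preparation asked about here can be sketched outside RapidMiner in plain Python: split each text into words, build one word-count vector per text, scale each vector to unit length, and run k-means on the vectors. This is only an illustrative sketch, not a RapidMiner process. The function names, the choice of k = 4 (one cluster per animal), and the deterministic farthest-first initialisation are all assumptions made for the example.

```python
import math
from collections import Counter

texts = [
    "ugly cat",
    "cute dog",
    "cat intestine",
    "barking dog",
    "cow gives milk",
    "cow is in the valley",
    "chicken eats corn",
]

def vectorize(docs):
    """Turn each text into a unit-length word-count vector over a shared vocabulary."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vec = [float(counts[word]) for word in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        # Length normalisation keeps long texts from looking "far" from short ones.
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iter=100):
    """Plain k-means with a deterministic farthest-first initialisation (an assumption)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(farthest)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each text goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vocab, vectors = vectorize(texts)
labels = kmeans(vectors, k=4)
for label, text in sorted(zip(labels, texts)):
    print(label, text)
```

Texts end up together only when they share words, so this works on the sample because each animal name repeats. Without the length normalisation, "cow is in the valley" would be as far from "cow gives milk" as from "chicken eats corn".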


  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn

    Do you know the video tutorials about RapidMiner? There are also some videos about text mining.

    Best, Marius
  • jose Member Posts: 16 Contributor II
    Thanks, Marius. To be honest, the videos did not help me much. I did not know which attributes to give k-means in order to group the texts. I generated the word frequency matrix and then applied k-means, which produced the clusters. The grouping was relatively good, but what I wanted was for the texts to be grouped by topic.

    My question is:
    Is there another way of classifying or grouping texts that gives better results?
  • dudester Member Posts: 15 Maven
    There are also tree clustering methods as well as EM (Expectation Maximization) clustering; see variable clustering methods and data mining. I don't know whether such an operator (EM) exists in DM...
    You may have to get creative.
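
To make the EM suggestion concrete, here is a small plain-Python sketch of EM clustering on the sample texts from the question. It is not a RapidMiner operator. It assumes a mixture of multinomial word distributions (one common choice for text), add-alpha smoothing in the M-step, k = 4, and a simple deterministic seeding rule; the names `pick_seeds` and `em_multinomial` are made up for the example.

```python
import math
from collections import Counter

texts = [
    "ugly cat", "cute dog", "cat intestine", "barking dog",
    "cow gives milk", "cow is in the valley", "chicken eats corn",
]
docs = [Counter(t.lower().split()) for t in texts]
vocab = sorted({w for d in docs for w in d})

def pick_seeds(docs, k):
    """Pick k seed texts that share as few words as possible (ties: lowest index)."""
    seeds = [0]
    while len(seeds) < k:
        def overlap(i):
            return max(sum((docs[i] & docs[s]).values()) for s in seeds)
        candidates = [i for i in range(len(docs)) if i not in seeds]
        seeds.append(min(candidates, key=overlap))
    return seeds

def em_multinomial(docs, vocab, k, alpha=0.1, iters=50):
    """EM for a mixture of multinomials; returns soft cluster memberships per text."""
    def normalise(counts):
        # Smoothed word distribution, so no word ever has probability zero.
        total = sum(counts.get(w, 0.0) for w in vocab) + alpha * len(vocab)
        return {w: (counts.get(w, 0.0) + alpha) / total for w in vocab}

    weights = [1.0 / k] * k
    thetas = [normalise(docs[s]) for s in pick_seeds(docs, k)]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each text, in log space.
        resp = []
        for d in docs:
            logp = [
                math.log(weights[j]) + sum(n * math.log(thetas[j][w]) for w, n in d.items())
                for j in range(k)
            ]
            top = max(logp)
            probs = [math.exp(lp - top) for lp in logp]
            z = sum(probs)
            resp.append([p / z for p in probs])
        # M-step: re-estimate mixing weights and word distributions from soft counts.
        weights = [sum(r[j] for r in resp) / len(docs) for j in range(k)]
        thetas = []
        for j in range(k):
            soft = Counter()
            for d, r in zip(docs, resp):
                for w, n in d.items():
                    soft[w] += r[j] * n
            thetas.append(normalise(soft))
    return resp

resp = em_multinomial(docs, vocab, k=4)
labels = [max(range(4), key=lambda j: r[j]) for r in resp]
for label, text in sorted(zip(labels, texts)):
    print(label, text)
```

Unlike k-means, EM gives each text a membership probability for every cluster; the hard label above is just the most probable one. Like k-means on word counts, it can only group texts that share vocabulary.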