Nancy Member Posts: 9 Contributor II
edited August 2019 in Help
I am working with a text document that contains around 1,000 short paragraphs. My objective is to group the words that frequently repeat in the paragraphs or sentences (the words need not be adjacent, as long as they appear in the same paragraph). Which operator can I use to classify the document on the basis of these groups of words?

Nancy :D


  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Sorry, but what you are asking for does not seem consistent. In the first sentence you explain that you want to group the words; in the question you want to classify the documents. From this I cannot tell what your real objective is or how I could help you reach it.

  • Nancy Member Posts: 9 Contributor II
    Hi Sebastian,

    My objective is to classify the documents. But first I want to group the words; after that I will classify the documents on the basis of these words. Can you suggest any way to group the words?
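
One possible reading of "grouping" here is counting which word pairs appear together in the same paragraph, regardless of position. The following is only a rough sketch of that idea in plain Python, not a RapidMiner operator; the function name `cooccurring_pairs`, the `min_count` threshold, and the sample paragraphs are all made up for illustration.

```python
from collections import Counter
from itertools import combinations
import re

def cooccurring_pairs(paragraphs, min_count=2):
    """Count how many paragraphs contain both words of each pair (hypothetical helper)."""
    pair_counts = Counter()
    for text in paragraphs:
        # Use a set so each word counts once per paragraph, wherever it appears.
        words = sorted(set(re.findall(r"[a-z]+", text.lower())))
        # Sorting first means every pair is stored in alphabetical order.
        pair_counts.update(combinations(words, 2))
    # Keep only pairs that co-occur in at least min_count paragraphs.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

# Illustrative data only.
paragraphs = [
    "The engine failed after the oil leak.",
    "An oil leak damaged the engine.",
    "The invoice was paid late.",
]
pairs = cooccurring_pairs(paragraphs)
```

With real text, stop words such as "the" would probably need to be filtered out first, since they pair with almost everything.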

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    If you want to select words that will be suitable for later classification, you could use feature selection or simply weighting.
    Otherwise, you would have to specify what objective you have in grouping the words. Every combination of words is a group, so just building a grouping isn't very meaningful.
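
To make the weighting suggestion concrete, here is a minimal sketch in plain Python, assuming the paragraphs already carry class labels. It is not how any RapidMiner operator is implemented; the weighting scheme (the spread in per-class document frequency), the names `word_weights` and `select_top_words`, and the sample data are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
import re

def word_weights(docs, labels):
    """Weight each word by how unevenly it is spread across the classes."""
    class_sizes = Counter(labels)
    doc_freq = defaultdict(Counter)  # word -> {label: number of docs containing it}
    for text, label in zip(docs, labels):
        for word in set(re.findall(r"[a-z]+", text.lower())):
            doc_freq[word][label] += 1
    weights = {}
    for word, per_class in doc_freq.items():
        # Fraction of each class's documents that contain the word.
        rates = [per_class[c] / class_sizes[c] for c in class_sizes]
        weights[word] = max(rates) - min(rates)
    return weights

def select_top_words(weights, k):
    """Keep the k highest-weighted words; ties are broken alphabetically."""
    return sorted(weights, key=lambda w: (-weights[w], w))[:k]

# Illustrative data only.
docs = ["oil leak in engine", "engine oil change",
        "invoice paid late", "late invoice for oil"]
labels = ["repair", "repair", "billing", "billing"]
weights = word_weights(docs, labels)
top = select_top_words(weights, 2)
```

A word that appears in every document of one class and in none of another gets weight 1.0, while a word spread evenly across classes gets 0.0 and would be dropped by the selection step.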

