
Regarding Text Classification

sudheendra Member Posts: 22 Maven
edited October 2019 in Help
Hi,

I have 1,000 text documents and want to classify them based on certain words they contain: if a document contains a given number of words from a list (word1, word2, ..., word10), it should be assigned to a group. I have already tried a clustering algorithm and got around 20 clusters, but I couldn't find any option there for this kind of classification. Is there a way to classify the documents based on an input word list?

Thanks,
Sudheendra

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Of course. But then you haven't learned anything at all. You could simply use an attribute construction operator, add the if clauses, and generate a new label attribute.
    But this isn't text mining at all...

    Greetings,
      Sebastian
  • sudheendra Member Posts: 22 Maven
    Hi Sebastian,

    I have already worked with the attribute construction operator on numerical attributes. If the same operator can be used on text data, how would I assign the label "Type A" when the text contains both "payment" and "claimant"?

    Thanks,
    Sudheendra
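
The rule asked about here (label a document "Type A" when it contains both "payment" and "claimant") can be sketched outside RapidMiner as a plain keyword test. The Python below is only an illustration of the if-clause idea, not the attribute construction operator itself; the names `tokenize` and `label_document`, the `min_hits` threshold, the "Other" fallback label, and the two sample documents are all assumptions made for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and return its distinct word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def label_document(text, keywords, label, min_hits=None, default="Other"):
    """Return `label` if the text contains at least `min_hits` keywords.

    With min_hits=None, every keyword must be present.
    """
    wanted = {k.strip().lower() for k in keywords}
    required = len(wanted) if min_hits is None else min_hits
    hits = len(wanted & tokenize(text))
    return label if hits >= required else default

# Hypothetical sample documents, for illustration only.
docs = [
    "The claimant received the payment last week.",
    "Payment is still pending.",
]
labels = [label_document(d, ["payment", "claimant"], "Type A") for d in docs]
print(labels)  # ['Type A', 'Other']
```

Setting `min_hits` to a smaller number covers the original question of requiring only some of the words from a ten-word list. As noted in the first answer, nothing is learned from the data here; the label is a direct consequence of the rule.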

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Wasn't it you to whom I recommended reading a book about text mining? It will become clear to you then. The word vector representation with TF-IDF is just the very basics. Sorry, but without that knowledge it doesn't make sense to continue.

    Greetings,
      Sebastian
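
For readers unfamiliar with the word vector representation mentioned in this reply, the following Python sketch shows one common TF-IDF variant: term frequency (count divided by document length) multiplied by log(N / document frequency). It is a minimal illustration under assumptions, not RapidMiner's implementation, whose exact weighting and normalization may differ; the function names and the three sample documents are made up for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

# Hypothetical sample documents, for illustration only.
docs = [
    "payment sent to claimant",
    "payment pending",
    "claimant filed appeal",
]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    weights = {t: round(w, 3) for t, w in zip(vocab, vec) if w > 0}
    print(doc, "->", weights)
```

Each document becomes a row of numeric attributes, one per word, so a keyword rule or a learned classifier can operate on those attributes. In this variant a word that occurs in every document gets weight 0, because it does not help tell the documents apart.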