Count two specific words in many documents

Picia Member Posts: 11 Contributor II
I would like to ask for help with the following problem. I have a lot of txt documents and I need to count the number of occurrences of specific words in each of them, that is, how many times word1 and word2 appear in every document. For instance, document1: word1 = 2 times, word2 = 7 times. I am able to count one word, but I need two.
I tokenize the words, transform cases, and filter tokens, but the "filter tokens" operator seems to accept only one string.

Answers

  • kayman Member Posts: 662 Unicorn
    You can use the WordList to Data operator for this. Once you have tokenized all your words in the Process Documents operator, its 'wor' output gives you what you need: the exact count of every token (word) you have. WordList to Data converts this list into an example set, and you can then apply a filter to keep only your two words along with their counts.
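
For readers who want to check the logic outside RapidMiner, here is a minimal Python sketch of the same idea: tokenize, lower-case, count every token, then keep only the two words of interest. It is not the RapidMiner process itself. The function names, the sample texts, the target words "cat" and "dog", and the assumption that the files are UTF-8 encoded are all made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path


def load_txt_folder(folder):
    """Hypothetical helper: read every .txt file in a folder (assumes UTF-8)."""
    return {p.name: p.read_text(encoding="utf-8") for p in Path(folder).glob("*.txt")}


def count_target_words(documents, targets):
    """Return {document name: {target word: count}} for each document."""
    targets = [t.lower() for t in targets]
    result = {}
    for name, text in documents.items():
        # Tokenize on runs of letters after lower-casing the text
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        # Full word list: the count of every token in this document
        word_list = Counter(tokens)
        # Filter the word list down to the target words (0 if a word is absent)
        result[name] = {t: word_list[t] for t in targets}
    return result


# Made-up sample documents standing in for the txt files
documents = {
    "document1": "The cat saw a dog. The dog ran; the cat and the dog slept.",
    "document2": "No pets here, only a Cat.",
}

counts = count_target_words(documents, ["cat", "dog"])
for name, per_doc in counts.items():
    print(name, per_doc)
# document1 {'cat': 2, 'dog': 3}
# document2 {'cat': 1, 'dog': 0}
```

To run it on real files, `count_target_words(load_txt_folder("some_folder"), ["word1", "word2"])` would be the equivalent call, where "some_folder" is a placeholder path. Note that this tokenizer splits on non-letters, so target words containing digits would need a different pattern.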