Searching for multiple different words in a dataset

Ayoube Member Posts: 5 Learner I
(1) Is there any way to search for multiple words at once?
I'm using Tokenize > Filter Tokens with Condition: contains and String set to the word I'm searching for. I tried separating several words with commas or spaces, but neither worked.

(2) I'm searching a number of nested directories. Can I get the results (e.g. word occurrences or frequencies) per directory as well as the overall results for the whole dataset?

Thanks

Answers

  • mpineda Member Posts: 4 Contributor I
    Hi, the Text Processing extension that you're already using has an operator named "Process Documents from Data". If you double-click the operator, you can place your tokenizing operators inside it and configure them as you wish.

    Process Documents from Data has two outputs: the first is the example set and the second is a word list. I think the word list is what you need. After this, you can add the "WordList to Data" operator to convert the word list into data.
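
    If it helps to cross-check the numbers outside RapidMiner, here is a rough Python sketch of the same idea: count several search words at once, per directory and overall. It is not RapidMiner code, and the names count_terms, read_tree and the sample data are made up for illustration. It also assumes plain-text files and a simple letters-only tokenizer, which may not match your Tokenize settings exactly.

```python
import os
import re
from collections import Counter, defaultdict

# Assumption: tokens are runs of ASCII letters, roughly like a non-letter tokenizer.
TOKEN_RE = re.compile(r"[A-Za-z]+")


def count_terms(docs, terms):
    """Count each search term per directory and over all documents.

    docs  -- iterable of (directory, text) pairs
    terms -- words to look for (case-insensitive, whole tokens only)
    """
    wanted = {t.lower() for t in terms}
    per_dir = defaultdict(Counter)
    overall = Counter()
    for directory, text in docs:
        for token in TOKEN_RE.findall(text.lower()):
            if token in wanted:
                per_dir[directory][token] += 1
                overall[token] += 1
    return dict(per_dir), overall


def read_tree(root, suffix=".txt"):
    """Yield (directory, text) for every matching file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    yield dirpath, handle.read()


# Hypothetical in-memory sample standing in for two directories of text files.
sample = [
    ("reports/2018", "Churn rose while revenue fell."),
    ("reports/2018", "Revenue recovered in Q4."),
    ("reports/2019", "Churn and revenue both improved; churn halved."),
]
per_dir, overall = count_terms(sample, ["churn", "revenue"])
```

    For real files you would pass count_terms(read_tree("your/root/folder"), [...]) instead of the sample list. Results are keyed by the directory that directly contains each file, so nested folders show up as separate entries, and the overall counter covers the whole dataset.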

    I hope this is what you needed.

    Thanks.