
Searching for multiple different words in a dataset

AyoubeAyoube Member Posts: 5 Learner I
edited June 2019 in Help
(1) Is there any way to search for multiple words at once?
I'm using Tokenize > Filter Tokens with Condition: contains and String: the word I'm searching for. I tried separating the words with commas or spaces, but neither worked for me.

(2) I'm searching a number of nested directories. Can I get the results (e.g. the word occurrences or frequencies) per directory, as well as the overall results for the whole dataset?

Thanks

Answers

  • mpinedampineda Member Posts: 4 Contributor I
    Hi, the Text Processing extension that you're already using includes an operator named "Process Documents from Data". If you double-click the operator, you can place all of your tokenizing operators inside it and configure them as you wish.

    Process Documents from Data has two outputs: the first is the example set and the second is a word list. I think the word list is what you need. After this, you can add the "WordList to Data" operator to convert the word list into data.

    I hope this is what you needed.

    Thanks.
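
For anyone who wants to cross-check the word list outside RapidMiner, here is a minimal Python sketch of the same idea: count several target words at once, per directory and overall. It is not a RapidMiner process. The function name `count_words_by_directory`, the simple `\w+` tokenizer, and the assumption that the files are plain UTF-8 text are all mine, not something from the thread.

```python
import os
import re
from collections import Counter, defaultdict


def count_words_by_directory(root, words):
    """Count target words per directory (relative to root) and overall.

    Hypothetical helper for illustration; assumes plain-text files.
    """
    targets = {w.lower() for w in words}
    per_dir = defaultdict(Counter)
    overall = Counter()

    # os.walk visits the root and every nested directory below it.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as fh:
                # Lowercase, then split into word tokens.
                tokens = re.findall(r"\w+", fh.read().lower())
            # Keep only tokens that exactly equal one of the target words.
            hits = Counter(t for t in tokens if t in targets)
            per_dir[os.path.relpath(dirpath, root)] += hits
            overall += hits

    return dict(per_dir), overall
```

Calling `count_words_by_directory("my_corpus", ["apple", "cherry"])` (with `my_corpus` standing in for your own folder) returns one counter per directory that contains files, plus one counter for the whole dataset. Note that this matches whole tokens exactly, which is stricter than a "contains" condition.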