Extraction of sentences based on a wordlist (to create a new doc)

DneHen Member Posts: 1 Learner I
Hello,

For my thesis I have to analyze multiple corporate reports. I need to extract from these reports the sentences that contain specific words (from a wordlist) and create a document with all the selected sentences, which will be used later for further analysis.

For that, I first used a "Read Document" operator. Then I used a "Process Documents" operator that contains a "Tokenize" operator (mode: linguistic sentences). After that (still inside the "Process Documents" operator), I used the "Filter Tokens (by Content)" operator and put in the string parameter the specific words that I want in the retained sentences.

My problem is that I can't put all the selected sentences in a list where they can easily be read separately. Instead, each selected sentence becomes an attribute. I think my problem is not complicated, but I can't find an answer on the forum that solves it.
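
To make the intended result concrete, here is a minimal sketch in Python, outside RapidMiner, of the output I am after: one selected sentence per line rather than one attribute per sentence. It is only an illustration under stated assumptions, not a RapidMiner process. The function names, the sample report text, and the wordlist are all hypothetical, and the regex sentence splitter is a naive stand-in for the linguistic sentence tokenizer.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    # A real linguistic tokenizer handles abbreviations and similar cases better.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def select_sentences(text, wordlist):
    # Keep only sentences containing at least one wordlist entry
    # (whole-word, case-insensitive match).
    if not wordlist:
        return []
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in wordlist) + r")\b",
        re.IGNORECASE,
    )
    return [s for s in split_sentences(text) if pattern.search(s)]


# Hypothetical sample data standing in for one corporate report.
report = (
    "The company reduced emissions by ten percent. "
    "Revenue grew in all regions. "
    "Climate risk remains a priority for the board."
)
wordlist = ["emissions", "climate"]

selected = select_sentences(report, wordlist)

# One sentence per line, ready to be saved as a new document.
output_text = "\n".join(selected)
print(output_text)
```

In this layout each retained sentence is a separate row of text, which is the shape I would like to get from the process instead of a single example with many attributes.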

I don't know much about data or how to use RapidMiner for text mining (this is my first time). I apologize if the answer is already on the forum and I simply searched for it the wrong way.

Thank you! 