Extraction of sentences based on a wordlist (to create a new doc)

DneHen, Member, Posts: 1, Newbie
Hello,

For my thesis I have to analyze multiple corporate reports. From these reports I need to extract the sentences that contain specific words (from a wordlist) and create a document with all the selected sentences, which will be used later for further analysis.

To do that, I first used a "Read Document" operator. Then I used a "Process Documents" operator that contains a "Tokenize" operator (mode: linguistic sentences). After that, still inside the "Process Documents" operator, I used "Filter Tokens (by Content)" and entered in its string parameter the specific words that I want the retained sentences to contain.

My problem is that I can't get the selected sentences into a list where each one can be read easily and separately. Instead, each selected sentence becomes an attribute. I don't think the problem is complicated, but I can't find an answer on the forum that solves it.
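To make the intended result concrete, here is a minimal sketch in Python rather than RapidMiner. It is only an illustration under assumptions: the sentence splitter is a naive regular expression (not RapidMiner's linguistic sentence tokenizer), and the names `split_sentences`, `select_sentences`, the sample `report` text, and the wordlist are all hypothetical.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def select_sentences(text, wordlist):
    # Keep sentences containing at least one wordlist entry
    # (whole-word, case-insensitive match).
    if not wordlist:
        return []
    alternatives = "|".join(re.escape(word) for word in wordlist)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [s for s in split_sentences(text) if pattern.search(s)]


# Hypothetical sample text standing in for one corporate report.
report = (
    "Revenue grew this year. The board discussed climate risk. "
    "Staff numbers were stable. Emissions fell by a tenth."
)

selected = select_sentences(report, ["risk", "emissions"])

# One selected sentence per line: the "new document" described above.
document = "\n".join(selected)
print(document)
```

The point of the sketch is the shape of the output: one row (line) per retained sentence, instead of one attribute (column) per sentence.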

I don't know much about data analysis or how to use RapidMiner for text mining (this is my first time). I apologize if the answer is already on the forum and I simply searched for it the wrong way.

Thank you! 