RapidMiner

Sentiment Analysis using Wordnet Dictionary

by RMStaff ‎06-24-2016 08:15 AM - edited ‎07-02-2016 05:48 AM

 

RapidMiner's text mining capabilities provide several methods for sentiment analysis. One popular method for English text uses the WordNet dictionary together with the relevant operators from the RapidMiner WordNet Extension. This article gives an overview of doing sentiment analysis with RapidMiner and the WordNet dictionary.

 

Prerequisites

You will need to download and install the "Wordnet Extension" from here

You will also need the "Text Processing" Extension from here

You will need to download the WordNet dictionary from here

 

Setup steps for the WordNet dictionary

The WordNet dictionary download is a file with the extension "gz". You will need a utility like 7Zip to extract it. Once you have the "WordNet-3.0.tar" file, extract that as well using the same 7Zip tool. You should then have a folder "WordNet-3.0" containing folders like dict, doc, include, etc.
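If you prefer a script to a GUI tool, the two extraction steps can be done in one pass. The following is a minimal Python sketch, not part of the original article. The function name `extract_wordnet` is hypothetical, and the code assumes the archive unpacks to a top-level "WordNet-3.0" folder as described above.

```python
import tarfile
from pathlib import Path


def extract_wordnet(archive: str, dest: str) -> Path:
    """Unpack a .tar.gz archive into dest and return the path to its dict folder.

    Assumption: the archive contains a top-level "WordNet-3.0" folder
    with a "dict" subfolder, as described in the article.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # Mode "r:gz" handles the gzip layer and the tar layer together,
    # so no intermediate .tar file is written to disk.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_path)
    dict_dir = dest_path / "WordNet-3.0" / "dict"
    if not dict_dir.is_dir():
        raise FileNotFoundError(f"Expected folder not found: {dict_dir}")
    return dict_dir
```

The returned path is the folder that the dictionary operator needs to point to later in the process. Only extract archives from a source you trust.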

 

Once you have done this, you are ready to build a text mining process with RapidMiner using the WordNet dictionary.

In the screenshot below we search Twitter, change the data type of the column we want to use for text processing, and then pass the dataset (ExampleSet) to "Process Documents from Data". You can replace the Search Twitter step with any data source of your choice, such as a database or Excel files. If you would like to use files from a folder, you can use the "Process Documents from Files" operator instead, or for email the "Process Documents from Mail Store" operator.

wordnet sentiment analysis.png

Then double-click the "Process Documents from Data" operator to build your text processing steps. Add your standard text processing steps, such as Tokenize, Transform Cases, Filter Stopwords, and Filter Tokens, based on your specific needs. The two operators you need to get the sentiment score are "Open WordNet Dictionary" and "Extract Sentiment (English)", both from the WordNet extension.
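For readers new to text processing, here is a rough Python illustration of what the standard preprocessing steps do to a document. It is only a sketch of the idea and not how the RapidMiner operators are implemented. The stopword list, the length limits, and the function name `preprocess` are all assumptions made for the example.

```python
import re
from typing import List

# Tiny illustrative stopword list; real stopword lists are much larger.
STOPWORDS = {"a", "an", "and", "is", "it", "of", "the", "this", "to"}


def preprocess(text: str, min_len: int = 3, max_len: int = 25) -> List[str]:
    """Tokenize, lowercase, drop stopwords, then filter tokens by length."""
    # Tokenize: keep runs of letters, split on everything else.
    tokens = re.findall(r"[A-Za-z]+", text)
    # Transform cases: lowercase every token.
    tokens = [t.lower() for t in tokens]
    # Filter stopwords: remove very common words.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Filter tokens: keep only tokens within the length limits.
    return [t for t in tokens if min_len <= len(t) <= max_len]
```

For example, `preprocess("This is a GREAT phone, and the battery is ok!")` returns `["great", "phone", "battery"]`. The order of the steps matters: lowercasing before the stopword filter lets "This" match the lowercase entry "this".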

 

Configure the "Open WordNet Dictionary" operator by selecting "directory" in the "resource type" parameter, then configure the "directory" parameter to point to the ....\WordNet-3.0\dict folder.
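Before pointing the operator at a folder, it can help to confirm that the folder really is the dict folder from the extracted dictionary. The sketch below is an optional check written for this purpose, not something the extension requires. It assumes the standard WordNet database file names (`index.noun`, `data.noun`, `index.verb`, `data.verb`), and the helper name `missing_wordnet_files` is hypothetical.

```python
from pathlib import Path
from typing import List

# A few of the database files found in a WordNet "dict" folder.
EXPECTED_FILES = ["index.noun", "data.noun", "index.verb", "data.verb"]


def missing_wordnet_files(dict_dir: str) -> List[str]:
    """Return the expected WordNet files that are NOT present in dict_dir."""
    folder = Path(dict_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]
```

An empty list means all of the checked files were found. A non-empty list suggests the path points at the wrong folder, for example at "WordNet-3.0" itself instead of its dict subfolder.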

processdocumentdetails.png

Please explore the help provided with the "Extract Sentiment (English)" operator to understand its parameters.

You can also use the WordNet operators for synonyms, hypernyms, and hyponyms to improve your process.

 

This process adds a new column "sentiment" that provides a numeric value for the sentiment. Negative sentiment is scored less than zero and positive sentiment is scored greater than zero.

You can use the sentiment score with the "Generate Attributes" operator to flag documents as Positive, Neutral, Negative, etc. based on the score value.
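The flagging rule itself is simple. As a sketch in Python (the attached process would express it in RapidMiner instead), it might look like the following. The function name `label_sentiment` and the optional `neutral_band` parameter are assumptions added for illustration. With the default band of zero, the rule matches the description above: below zero is negative, above zero is positive.

```python
def label_sentiment(score: float, neutral_band: float = 0.0) -> str:
    """Map a numeric sentiment score to a Positive/Neutral/Negative label.

    Scores within [-neutral_band, +neutral_band] are treated as Neutral.
    """
    if score > neutral_band:
        return "Positive"
    if score < -neutral_band:
        return "Negative"
    return "Neutral"
```

A small non-zero band, such as `label_sentiment(0.05, neutral_band=0.1)`, treats weak scores as Neutral rather than forcing them into a positive or negative class. Whether that is useful depends on your data.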

 

See the attached process for the complete example.

You can open the process in RapidMiner Studio from the File menu: File >> Import Process.

Comments
jai
Contributor

I'm facing the following issue. If anyone can address it, that would be really great.


(Quoting the full article above, by bhupendra_patil.)

Screenshot from 2016-10-20 16_30_43.png

 

aluna04
Contributor

Hi,

I also have the same problem. Hope someone can help.

Thx!