Count phrases for sentiment analysis
I know how to count words from my custom dictionary in RapidMiner. How can I count the number of occurrences of certain phrases (e.g. "very good"), or in other words, of n-grams? My first idea was to generate n-grams from all words and word combinations, count all possible n-grams using the "Process Documents" operator, and then somehow get rid of all the obsolete n-grams at the end.
Best Answer
bhupendra_patil Employee, Member Posts: 168 RM Data Scientist
If you are just looking for the count of certain words, n-grams, etc., then the word list that comes out of the "Process Documents" operator will give you the number of times each word or n-gram appeared, as well as the number of documents it appeared in.
If you then need only certain words or n-grams, I recommend the following approach.
Use the "WordList to Data" operator to convert the word list to an ExampleSet, and then join it (left, right, or inner, depending on how you connect the inputs) with another ExampleSet that contains the list of words and n-grams you are interested in.
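To illustrate the idea outside of RapidMiner, here is a minimal Python sketch of the same logic: build a word list with total and per-document counts for all unigrams and bigrams, then keep only the phrases of interest (the equivalent of an inner join). The function names, the sample documents, and the whitespace tokenization are my own assumptions for illustration; this is not what the RapidMiner operators do internally.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_wordlist(documents, max_n=2):
    """Count total occurrences and document occurrences for every 1..max_n-gram."""
    total = Counter()    # how many times each term appears overall
    in_docs = Counter()  # how many documents each term appears in
    for text in documents:
        # Assumption: simple lowercase + whitespace tokenization.
        tokens = text.lower().split()
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        total.update(terms)
        in_docs.update(set(terms))  # count each term once per document
    return total, in_docs


def filter_wordlist(total, in_docs, phrases):
    """Inner join: keep only the phrases of interest that occur in the word list."""
    return {p: (total[p], in_docs[p]) for p in phrases if p in total}


# Hypothetical sample data.
documents = [
    "very good service and very good food",
    "the food was not good",
    "very bad experience",
]
phrases_of_interest = ["very good", "not good", "very bad", "good", "excellent"]

total, in_docs = build_wordlist(documents, max_n=2)
counts = filter_wordlist(total, in_docs, phrases_of_interest)

for phrase, (occurrences, doc_count) in counts.items():
    print(f"{phrase}: {occurrences} occurrence(s) in {doc_count} document(s)")
```

With an inner join, phrases that never occur (here "excellent") simply drop out; a left join on the phrase list would keep them with missing counts instead.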
Answers
Hi ln777,
Your way sounds totally fine to me. I would use a feature selection technique to get rid of all the obsolete ones.
~Martin
Dortmund, Germany
Thanks a lot for your answer. I think it will work for my problem.