
Count phrases for sentiment analysis

In777 Member Posts: 29 Contributor II
edited November 2018 in Help

I know how to count words from my custom dictionary in RapidMiner. How can I count the number of occurrences of certain phrases (e.g. "very good"), in other words of n-grams? My first idea was to generate n-grams from all words and word combinations, count all possible n-grams using the "Process Documents" operator, and then somehow get rid of all the obsolete n-grams.

Best Answer

  • bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist
    Solution Accepted

    If you are just looking for the count of certain words, n-grams, etc., then the wordlist that comes out of the "Process Documents" operator will give you the number of times each word or n-gram appeared, as well as the number of documents it appeared in.

    If you then need only certain words or n-grams, I recommend the following approach.

    Use the "WordList to Data" operator to convert the wordlist to an ExampleSet, and then join it (left, right, or inner, depending on how you connect the inputs) with another ExampleSet that contains the list of words and n-grams you are interested in.
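
For readers who want to see the logic outside of RapidMiner, the approach above can be sketched in plain Python. This is only an illustration under stated assumptions, not RapidMiner's implementation: the tokenizer, the function names (`ngrams`, `build_wordlist`, `filter_wordlist`) and the sample documents are all made up for this example. It builds a wordlist with total and per-document counts for every n-gram, then keeps only the phrases of interest, which corresponds to an inner join with the phrase list.

```python
import re
from collections import Counter


def ngrams(tokens, max_n):
    """Yield every n-gram (tokens joined by a space) for n = 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def build_wordlist(documents, max_n=2):
    """Return (total occurrences, document occurrences) for every n-gram."""
    total = Counter()
    doc_freq = Counter()
    for text in documents:
        # Assumed tokenizer: lowercase, keep runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = list(ngrams(tokens, max_n))
        total.update(grams)          # every occurrence counts
        doc_freq.update(set(grams))  # each document counts at most once
    return total, doc_freq


def filter_wordlist(total, doc_freq, phrases):
    """Inner join: keep only phrases that occur in the wordlist."""
    return {p: (total[p], doc_freq[p]) for p in phrases if p in total}


# Hypothetical sample data, for illustration only.
documents = [
    "The food was very good and the service was very good too.",
    "Not bad, but the room was not very good.",
    "Very bad experience.",
]
phrases = ["very good", "very bad", "not bad", "excellent"]

total, doc_freq = build_wordlist(documents)
counts = filter_wordlist(total, doc_freq, phrases)
for phrase, (occurrences, docs) in counts.items():
    print(phrase, occurrences, docs)
```

"excellent" never occurs in the sample documents, so the inner join drops it; a left join from the phrase list would keep it with a count of zero instead. In RapidMiner, check how the n-grams are written in your wordlist (for example, which separator joins the tokens) so that the join keys in both ExampleSets match.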

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,511 RM Data Scientist

    Hi ln777,

    Your approach sounds totally fine to me. I would use a feature selection technique to get rid of all the obsolete ones.

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • In777 Member Posts: 29 Contributor II

    Thank you very much for your answer. I think it will work for my problem.
