How to connect SentiWordNet to RapidMiner?

SindhuCMA Member Posts: 4 Contributor I
edited November 2018 in Help
Sir,

I'm a beginner and have just started learning to use RapidMiner. SentiWordNet is a text file; how do I connect it so that I can prune tokens using the sentiment score? I'm done with tokenizing, stop-word filtering, and stemming with the help of WordNet. Please help me. Thanks in advance :)

Answers

  • Nils_Woehler Member Posts: 463 Maven
    Hi,

    you can try out the WordNet extension available via the update manager.

    Best,
    Nils
  • SindhuCMA Member Posts: 4 Contributor I
    Hi Sir,

    SentiWordNet is a text file. Open WordNet Dictionary can be used only with exe files. How do I extract the scores from SentiWordNet for further processing? Please help me. Thanks in advance.


    Regards,
    Sindhu
  • SindhuCMA Member Posts: 4 Contributor I
    Somebody please help me connect SentiWordNet to RapidMiner.

    Thanks, SindhuCMA
  • startx25 Member Posts: 7 Contributor II
    Hi SindhuCMA,

    Did you find a way to use the SentiWordNet text file as a RapidMiner dictionary?
    I hope you can help me.

    Regards
  • mwenglein Member Posts: 1 Contributor I
    Any answers on this one? I'm looking for a quick example of how to use SentiWordNet to score a directory of documents. The overall handling of RapidMiner is really useful, and my process is already extracting, tokenizing, and cleaning the data nicely. All I'm missing is a way to use the text format of SentiWordNet to derive sentiment.
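
For readers who want to look at the scores outside RapidMiner, here is a minimal Python sketch of reading the SentiWordNet text format. It assumes the SentiWordNet 3.0 layout: tab-separated columns POS, ID, PosScore, NegScore, SynsetTerms, Gloss, with comment lines starting with "#". The sample rows, their scores, and the function names are made up for illustration, and averaging over all senses of a word is a simplification, not how any RapidMiner operator is known to work.

```python
from collections import defaultdict


def parse_sentiwordnet(lines):
    """Map (pos, word) to a list of (PosScore, NegScore) pairs, one per sense."""
    scores = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the "#" comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue
        pos, _synset_id, pos_score, neg_score, terms = fields[:5]
        try:
            pair = (float(pos_score), float(neg_score))
        except ValueError:
            continue
        # SynsetTerms looks like "word#1 other_word#3"; drop the sense number.
        for term in terms.split():
            word = term.rsplit("#", 1)[0]
            scores[(pos, word.lower())].append(pair)
    return scores


def polarity(scores, word, pos="a"):
    """Average PosScore - NegScore over all senses; None if the word is unknown."""
    entries = scores.get((pos, word.lower()))
    if not entries:
        return None
    return sum(p - n for p, n in entries) / len(entries)


# Hypothetical rows in the assumed format; the scores are NOT real SentiWordNet values.
SAMPLE = [
    "# POS\tID\tPosScore\tNegScore\tSynsetTerms\tGloss",
    "a\t00000001\t0.75\t0.0\tgreat#1\tmade-up gloss for illustration",
    "a\t00000002\t0.25\t0.5\tslow#1 sluggish#2\tmade-up gloss for illustration",
]

if __name__ == "__main__":
    table = parse_sentiwordnet(SAMPLE)
    for word in ("great", "slow", "camera"):
        print(word, polarity(table, word))
```

To run it against the real file, pass an open file handle instead of SAMPLE, for example `parse_sentiwordnet(open(path, encoding="utf-8"))`, where `path` points to your own copy of the SentiWordNet text file.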

  • mrfarhankhan Member Posts: 14 Contributor II
    I'm also looking for the same. Any help?
  • Nils_Woehler Member Posts: 463 Maven
    There is currently no way to use SentiWordNet with RapidMiner. However, we have developed a new operator which will be added to the WordNet extension. It will be released with the next update of the WordNet extension.

    Cheers,
    Nils
  • Nils_Woehler Member Posts: 463 Maven
    Hi,

    just a short update: We have released the new version (5.3.0) of the WordNet extension today. It includes an operator to work with SentiWordNet.

    Cheers,
    Nils
  • mrfarhankhan Member Posts: 14 Contributor II
    Hello Nils,

    I have downloaded this extension on RM 6.3 and I cannot find the operator for SentiWordNet. I see only two options, "resource type" and "directory". Do I have to install anything else?

    Thanks
    Farhan
  • Nils_Woehler Member Posts: 463 Maven
    Hi Farhan,

    did you download the WordNet extension? If not, please go to "Help>Updates and Extensions" and search for the WordNet extension.
    After downloading and restarting you should be able to find the new operator "Extract Sentiment (English)".

    Cheers,
    Nils
  • mrfarhankhan Member Posts: 14 Contributor II
    Thanks for your response, Nils. I did find this Extract Sentiment operator. Can you please refer me to a small example of how to use it?
  • Nils_Woehler Member Posts: 463 Maven
    Sure. Here you go:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000-SNAPSHOT">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.4.000-SNAPSHOT" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="text:create_document" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Create Document" width="90" x="45" y="75">
            <parameter key="text" value="Thanks for checking out this amazing example process! This is a really great example for checking sentiment."/>
          </operator>
          <operator activated="true" class="text:tokenize" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Tokenize" width="90" x="179" y="75"/>
          <operator activated="true" class="wordnet:open_wordnet_dictionary" compatibility="5.3.000" expanded="true" height="60" name="Open WordNet Dictionary" width="90" x="45" y="255">
            <parameter key="directory" value="C:\Users\nwoehler\Downloads\WordNet-3.0\dict"/>
          </operator>
          <operator activated="true" class="text:create_document" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Create Document (2)" width="90" x="45" y="390">
            <parameter key="text" value="Great camera, easy to use, good looking, high quality images.&#10;"/>
          </operator>
          <operator activated="true" class="text:tokenize" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Tokenize (2)" width="90" x="179" y="390"/>
          <operator activated="true" class="multiply" compatibility="6.4.000-SNAPSHOT" expanded="true" height="112" name="Multiply" width="90" x="179" y="255"/>
          <operator activated="true" class="wordnet:find_sentiment_wordnet" compatibility="5.3.000" expanded="true" height="76" name="Extract Sentiment (2)" width="90" x="380" y="345">
            <parameter key="threshold" value="0.0"/>
            <parameter key="use_nouns" value="false"/>
            <parameter key="use_verbs" value="false"/>
            <parameter key="use_adverbs" value="false"/>
          </operator>
          <operator activated="true" class="text:documents_to_data" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Documents to Data (2)" width="90" x="648" y="345">
            <parameter key="text_attribute" value="text"/>
          </operator>
          <operator activated="true" class="wordnet:find_sentiment_wordnet" compatibility="5.3.000" expanded="true" height="76" name="Extract Sentiment (English)" width="90" x="380" y="165"/>
          <operator activated="true" class="text:documents_to_data" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Documents to Data" width="90" x="648" y="165">
            <parameter key="text_attribute" value="text"/>
          </operator>
          <operator activated="true" class="text:create_document" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Create Document (3)" width="90" x="45" y="480">
            <parameter key="text" value="Memory too small, slow charge between shots. "/>
          </operator>
          <operator activated="true" class="text:tokenize" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Tokenize (3)" width="90" x="179" y="480"/>
          <operator activated="true" class="wordnet:find_sentiment_wordnet" compatibility="5.3.000" expanded="true" height="76" name="Extract Sentiment (3)" width="90" x="380" y="480">
            <parameter key="threshold" value="0.0"/>
            <parameter key="use_nouns" value="false"/>
            <parameter key="use_verbs" value="false"/>
            <parameter key="use_adverbs" value="false"/>
          </operator>
          <operator activated="true" class="text:documents_to_data" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Documents to Data (3)" width="90" x="648" y="480">
            <parameter key="text_attribute" value="text"/>
          </operator>
          <operator activated="true" class="append" compatibility="6.4.000-SNAPSHOT" expanded="true" height="112" name="Append" width="90" x="849" y="255"/>
          <connect from_op="Create Document" from_port="output" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Extract Sentiment (English)" to_port="document"/>
          <connect from_op="Open WordNet Dictionary" from_port="dictionary" to_op="Multiply" to_port="input"/>
          <connect from_op="Create Document (2)" from_port="output" to_op="Tokenize (2)" to_port="document"/>
          <connect from_op="Tokenize (2)" from_port="document" to_op="Extract Sentiment (2)" to_port="document"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Extract Sentiment (English)" to_port="dictionary"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Extract Sentiment (2)" to_port="dictionary"/>
          <connect from_op="Multiply" from_port="output 3" to_op="Extract Sentiment (3)" to_port="dictionary"/>
          <connect from_op="Extract Sentiment (2)" from_port="document" to_op="Documents to Data (2)" to_port="documents 1"/>
          <connect from_op="Documents to Data (2)" from_port="example set" to_op="Append" to_port="example set 2"/>
          <connect from_op="Extract Sentiment (English)" from_port="document" to_op="Documents to Data" to_port="documents 1"/>
          <connect from_op="Documents to Data" from_port="example set" to_op="Append" to_port="example set 1"/>
          <connect from_op="Create Document (3)" from_port="output" to_op="Tokenize (3)" to_port="document"/>
          <connect from_op="Tokenize (3)" from_port="document" to_op="Extract Sentiment (3)" to_port="document"/>
          <connect from_op="Extract Sentiment (3)" from_port="document" to_op="Documents to Data (3)" to_port="documents 1"/>
          <connect from_op="Documents to Data (3)" from_port="example set" to_op="Append" to_port="example set 3"/>
          <connect from_op="Append" from_port="merged set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Load the WordNet dictionary once (the example uses Multiply to share it between the three Extract Sentiment operators), tokenize your document, and pass it to Extract Sentiment. Afterwards the document has a Sentiment annotation, which can be transformed back to data with Documents to Data.

    Cheers,
    Nils
  • sukh Member Posts: 43 Contributor II
    I followed the instructions in your post. Everything is set up accordingly, but the Extract Sentiment operator asks for a dictionary as input. When I provide /usr/local/WordNet-3.0/dict, it gives me this error message:

    Error reading /usr/local/WordNet-3.0/dict:map failed
    the given resource could not be read and parsed .Please make sure the file is well formed and parsing parameters are specified correctly.

    Reason: map failed

    cause : Open WordNet Dictionary



    Please help me solve this error.
    Thanks.
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    You have to place the SentiWordNet text file in the same folder as the WordNet dictionary.

    Andrew
  • sukh Member Posts: 43 Contributor II
    Thanks a lot for your reply.
    I followed your instructions but unfortunately got the same result.
    Kindly guide me through the exact steps I need to follow.
    Looking forward to your reply.
    Thanks in advance.
    Regards,
    Sukh
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello Sukh

    You will need to post your process before anyone will invest free time on this.

    regards

    Andrew
  • sukh Member Posts: 43 Contributor II
    Thanks a lot, Sir.

    The process I am following for sentiment analysis is as follows:

    Step 1: Downloaded the latest version of WordNet (WordNet-3.0.tar.gz) from the link below:

          https://wordnet.princeton.edu/wordnet/download/current-version/
    Step 2: Installed the latest version of RapidMiner (RapidMiner Studio 6.3.000).
    Step 3: Used the operators Process Documents -> Tokenize -> Filter Stopwords -> Extract Sentiment.
    Step 4: The Extract Sentiment operator needs a dictionary as input. When I gave it the path \usr\local\WordNet-3.0\dict,
    it gave me the following error message:
    Reason: map failed.
    The given resource could not be read and parsed.Please make sure the file is well formed and parsing parameters are specified correctly.
    cause : Open WordNet Dictionary
    Thanks.

    Regards:
    Sukh

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello sukh

    It looks like the contents of the folder that should hold the WordNet components are incorrect.

    You seem to be on a Unix system, although I notice the path uses "\" rather than the expected "/".

    Could it be as simple as changing the delimiter? I don't know whether the Open WordNet Dictionary operator is able to cope with both.

    regards

    Andrew
  • sukh Member Posts: 43 Contributor II
    Sir, I used "/" but mistakenly posted it as "\".
    How can I check whether the contents of the WordNet-3.0 dict folder are correct? I also added the SentiWordNet text file to the dict folder in WordNet-3.0.
    Kindly help me.
    Thanks
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello Sukh

    I'm seeing 25 files in my dict folder with names like data.verb and index.adj. The SentiWordNet txt file is also there.

    The total size of all files is around 48 MB.

    regards

    Andrew
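
The folder comparison described above can be sketched in Python. This is only a rough inventory check under stated assumptions: the list of expected names covers just the data.* and index.* files for the four parts of speech (a WordNet dict folder holds more than these), the "sentiwordnet" name match is a guess at how the text file is named, and DICT_DIR is simply the path quoted earlier in the thread.

```python
import os

# Assumed minimum set of WordNet database files; a real dict folder holds more.
EXPECTED = [
    f"{kind}.{pos}"
    for kind in ("data", "index")
    for pos in ("noun", "verb", "adj", "adv")
]

# Path taken from the error message in this thread; change it to your own folder.
DICT_DIR = "/usr/local/WordNet-3.0/dict"


def summarize_dict_folder(folder):
    """Count the files, total their size, and list expected files that are absent."""
    names = sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )
    total_bytes = sum(os.path.getsize(os.path.join(folder, name)) for name in names)
    return {
        "file_count": len(names),
        "total_bytes": total_bytes,
        "missing": [name for name in EXPECTED if name not in names],
        # Assumption: the SentiWordNet file has "sentiwordnet" somewhere in its name.
        "sentiwordnet_files": [n for n in names if "sentiwordnet" in n.lower()],
    }


if __name__ == "__main__":
    if os.path.isdir(DICT_DIR):
        for key, value in summarize_dict_folder(DICT_DIR).items():
            print(key, value)
    else:
        print("Folder not found:", DICT_DIR)
```

A file count near 25, a total around 48 MB, an empty "missing" list, and one SentiWordNet file would match the folder described in the post above.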
  • sukh Member Posts: 43 Contributor II
    Sir, I also have 25 files in the dict folder. I have now tried to open the plain WordNet-3.0/dict folder for stemming, and it gave me the same error. So the problem is not in the Extract Sentiment operator; RapidMiner is unable to open WordNet even for stemming. I could not figure out the actual reason for "map failed" or how to fix it on Linux. It works well on Windows.

    Thanks and Regards:
    Sukh
  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Yes, it is more fundamental than sentiment analysis because it fails in the Open WordNet Dictionary operator.

    If it works on Windows but not on Linux, it's hard for me to help because I run Windows. Perhaps you can find a system utility that monitors which files are being accessed by which process, to confirm that RapidMiner is really able to see and read the folder containing the files.

    regards

    Andrew
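
One way to approximate that check without a monitoring utility is a small Python sketch that tries to read and memory-map every file in the folder as the current user. The wording "map failed" resembles the message Java gives when memory-mapping a file fails, but whether the operator actually memory-maps the dictionary files is an assumption here, and a successful mmap from Python does not prove that the JVM running RapidMiner can do the same. The function name and DICT_DIR are illustrative.

```python
import mmap
import os

# Path taken from the error message in this thread; change it to your own folder.
DICT_DIR = "/usr/local/WordNet-3.0/dict"


def check_readable_and_mappable(folder):
    """Return (file name, problem) pairs for files that cannot be read or mapped."""
    problems = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        if not os.access(path, os.R_OK):
            problems.append((name, "not readable by this user"))
            continue
        if os.path.getsize(path) == 0:
            # Empty files cannot be memory-mapped at all.
            problems.append((name, "empty file"))
            continue
        try:
            with open(path, "rb") as handle:
                with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                    mapped[:1]  # touch the first byte to force an actual read
        except (OSError, ValueError) as error:
            problems.append((name, f"mmap failed: {error}"))
    return problems


if __name__ == "__main__":
    if os.path.isdir(DICT_DIR):
        found = check_readable_and_mappable(DICT_DIR)
        for name, problem in found:
            print(name, "->", problem)
        if not found:
            print("All files could be read and memory-mapped.")
    else:
        print("Folder not found:", DICT_DIR)
```

Run it as the same user that starts RapidMiner; otherwise the permission check says little about what RapidMiner itself can see.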

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    I just tried this on an Ubuntu machine and it worked without difficulty.

    Please post the XML of your process.

    Andrew