SOLVED: Problem with Sentiment Analysis Wizard

cmvb15 Member Posts: 3 Contributor I
edited October 2019 in Help
Hi!

I am very new to RapidMiner and am testing some features to explore what it can do for my company. The Sentiment Analysis feature seems very interesting and useful, but it is not working for me.

Using the 'test dataset' it runs without problems. But when I tried using my own data, it refused to work. Assuming a simple error on my part, I adapted the data: I switched and renamed the columns, tried different import formats (.xlsx, .xls, .csv, and data imported directly into the tool), switched the language (from Portuguese to English), used a smaller file (30 lines), and exported the test data to re-import it. All of these variations failed.

Despite my formatting being correct, the error always states: "The application process cannot be executed on your data.  Please make sure the structure of your data matches the one from the demo data.  Also, make sure to select an appropriate column in Step 3."

Any advice on what I should do?
And, if I get this fixed, would this tool also work to analyze sentiment in Brazilian Portuguese?

Thanks a lot. I'm really hoping that we can get this fixed. This seems like a perfect tool for my company to use!


Comments

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    I think the problem that occurs most often is that the "Sentiment" column is not set up correctly. For the predictive model to work, a column is required that contains labels describing the sentiment (e.g. the text is "positive" or "negative"). However, for the model to actually classify something, some of the labels must be left empty. If a sentiment is already set for every row, there is nothing to classify. Conversely, if the sentiment is missing for every row, there is nothing the model can base its classification on: it has to see some examples that tell it "this is a positive text" or "this is a negative text".
    If either of these requirements is not met, nothing can be classified.

    Regards,
    Marco
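
The two requirements described above (some rows labeled, some rows left empty) can be checked before running the wizard. The following is a minimal Python sketch, not part of RapidMiner. The column names "Text" and "Sentiment", the sample rows, and the helper `check_sentiment_column` are assumptions made for illustration only.

```python
import csv
import io


def check_sentiment_column(rows, label_column="Sentiment"):
    """Return (ok, message) for a list of dict rows.

    ok is True only if at least one row has a label and at least
    one row has an empty label. The column name is an assumption.
    """
    labels = [(row.get(label_column) or "").strip() for row in rows]
    labeled = [value for value in labels if value]
    unlabeled = len(labels) - len(labeled)
    if not labeled:
        return False, "no labeled rows: the model has no examples to learn from"
    if unlabeled == 0:
        return False, "every row is labeled: nothing is left to classify"
    return True, f"{len(labeled)} labeled, {unlabeled} unlabeled"


# Hypothetical sample data: two labeled rows and one row left to classify.
SAMPLE = """Text,Sentiment
Great product,positive
Terrible support,negative
Arrived on time,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ok, message = check_sentiment_column(rows)
print(ok, message)
```

To check a real export, replace `io.StringIO(SAMPLE)` with an opened CSV file and adjust `label_column` to match the header used in your data.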
  • cmvb15 Member Posts: 3 Contributor I
    Hi Marco!

    Thanks for the reply!  After deleting some of the sentiments, the test data works.  :D

    Now, do you know how to make it interpret responses in other languages? (Specifically Brazilian Portuguese.)

    Thanks so much!
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    You cannot easily modify the wizard itself to suit your needs. However, the wizard only uses functionality that is available in RapidMiner Studio 6 anyway, so you can adapt the process behind the wizard instead. You will not get the nice dashboard with this approach, but everything displayed there can also be displayed manually in the results perspective of RapidMiner Studio.
    To do so, I recommend you familiarize yourself with RapidMiner Studio a bit first (if you haven't yet) by following the Tutorials.
    After you have done so, please follow these instructions:

    1) Run the wizard on your data

    2) On the final results dashboard, click on "Show the process" in the bottom right corner. This will take you to the main process design perspective of RapidMiner Studio, where you can create and modify your own processes. Do not panic, it looks quite complicated but you don't need to understand everything right now ;)

    3) In your case, what you need to do is exchange the stopword list, which is English by default for the wizard. Stopwords are words that are not useful for classification, like "a", "one", "who", etc.

    3.1) To do so, double-click on the operator shown in the picture below. Replace the "Filter Stopwords" operator with the "Filter Stopwords (Dictionary)" operator. This operator expects a simple .txt file on your hard disk containing stopwords (one per row) for the language of your choice. Select the operator, and add the list in the "file" parameter.

    image

    3.2) Connect the input and output ports as they were before.

    4) Voila, the hardest part is over! Now save the process, and run it! You will see three results.

    4.1) The first one, called "Example Set (Guess Types)" displays the classification you saw before in the wizard dashboard. The other two results can be used to re-create the charts from the dashboard via the "Charts" tab on the left.


    This whole process might seem daunting at first, but once you get the hang of it, you will notice how much power and control you have at your hands.

    Regards,
    Marco
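
To make step 3 more concrete, here is a rough Python sketch of what dictionary-based stopword filtering does in general: load one word per row, then drop every token that appears in that list. This is not RapidMiner's implementation of "Filter Stopwords (Dictionary)". The function names and the tiny hand-picked Portuguese word list are assumptions made for illustration.

```python
def load_stopwords(lines):
    """Build a stopword set from an iterable of lines, one word per line."""
    return {line.strip().lower() for line in lines if line.strip()}


def filter_stopwords(tokens, stopwords):
    """Drop tokens that appear in the stopword set (case-insensitive)."""
    return [token for token in tokens if token.lower() not in stopwords]


# Hand-picked illustrative list; a real list would be read from a .txt file.
stopwords = load_stopwords(["a", "o", "de", "que", ""])
tokens = "o atendimento de hoje foi ótimo".split()
print(filter_stopwords(tokens, stopwords))
```

With a real list, something like `load_stopwords(open(path, encoding="utf-8"))` would read the same one-word-per-row file that the operator's "file" parameter points to.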
  • cmvb15 Member Posts: 3 Contributor I
    Hey Marco!

    Works like a charm  ;D  Thanks a lot!


    FYI, I found a stoplist for Portuguese here:  http://snowball.tartarus.org/algorithms/portuguese/stop.txt
    and it seems they have it for almost all languages as well: http://snowball.tartarus.org/algorithms/

    Thanks a lot for the help,
    -Cathy
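
A downloaded stoplist may need a little cleanup before it matches the "one stopword per row" format described earlier in the thread. The sketch below assumes the file annotates words with comments that start with a vertical bar (`|`), with each stopword at the start of a line; check your copy of the file before relying on this. The sample text and the function names are made up for illustration.

```python
from pathlib import Path


def extract_stopwords(text):
    """Return the words found before any '|' comment, in file order."""
    words = []
    for line in text.splitlines():
        content = line.split("|", 1)[0].strip()  # drop the comment part
        if content:
            words.extend(content.split())
    return words


def write_stopword_file(words, path):
    """Write one stopword per row, as a plain UTF-8 text file."""
    Path(path).write_text("\n".join(words) + "\n", encoding="utf-8")


# Hypothetical sample in the assumed format, not the real file contents.
SAMPLE = """ | A sample stop word list
de             |  of, from
a              |  the; to, at; her

 | a line that contains only a comment
que            |  who, that
"""

print(extract_stopwords(SAMPLE))
```

Calling `write_stopword_file(extract_stopwords(text), "stopwords_pt.txt")` (the output filename is arbitrary) would then produce a file to select in the operator's "file" parameter.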
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    cool, you're welcome! And thanks for sharing :)

    Regards,
    Marco
  • lmcdowell Member Posts: 4 Contributor I
    Hello. I am also a newbie and am having the same problem with the error message "The application process cannot be executed on your data. Please make sure the structure of your data matches the one from the demo data. Also, make sure to select an appropriate column in Step 3." The data structure matches the demo data. After reading this thread, I went back and removed some of the positive and negative entries, but the file is still not working and I still get the same error message. I am using the starter version of RapidMiner. Could that be the reason?
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    please see my answer in this thread: http://rapid-i.com/rapidforum/index.php/topic,8238.0.html

    Regards,
    Marco
  • rsadoddin Member Posts: 1 Contributor I
    Hello Marco,
    I have exactly the same problem mentioned here. I am checking the structure. It seems exactly what is required. Two columns with specified headers, and with positive, negative or missing values.
    I am getting the same error message.

    Would you please comment on this issue in more detail? I checked the other thread and it didn't solve my problem.
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,995 RM Engineering
    Hi,

    It's hard to say what is going on without seeing your data. Would it be possible to send me a small number of samples with which the problem occurs via PM? I can then either help you or at least file a bug report if it's an as yet unknown issue.

    Regards,
    Marco
  • rajbanokhan Member Posts: 29 Maven
    edited October 2019
    Hi sir, in the diagram or process above for sentiment analysis I couldn't find these operators: training data, process training, weights, model and evaluation, drop words, process apply, apply data. Which extension do we need for this?
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hi @rajbanokhan, this is a VERY old thread. 😀 You should get the Text Processing extension from the Marketplace.

    I would strongly recommend going through the Text and Web Mining online tutorial on the RapidMiner Academy. It will walk you through everything.

    Scott