SVM Extract keywords used for sentiment

asav_yu Member Posts: 15 Maven
edited December 2018 in Help
Good afternoon,

Hopefully somebody can help. I am experimenting with sentiment analysis using SVM, and the results are very promising. My question is: how can I easily extract a list of the words in a document that I score, so I can see exactly why its sentiment is negative or positive?

Example: when I score a 100-word paragraph, I want to see all the keywords that the SVM identified as important. It would be great to have counts as well, for example "bad" 4 times, "poor" 3 times.

Any advice is very much appreciated.

Answers

  • HeikoeWin786 Member Posts: 64 Contributor II
    Hello there,

    Can I use the word list that I generated after pre-processing the documents with the Process Documents from Data operator as an input for the SVM operator?
    My dataset has a binominal label and a polynominal review text column. I am not sure which column I need to convert to numerical to work with SVM: the sentiment label column or the customer review text column?

    Thanks much in advance.

  • HeikoeWin786 Member Posts: 64 Contributor II
    @Telcontar120
    Hi,

    Could you please kindly explain the mentioned approach?
    I would like to use SVM to extract the aspects that are associated with the label (e.g. aspect = 'service', label = positive) for each example in the dataset.
    I am having an issue with using my dataset as the training dataset. It says SVM cannot accept polynominal data. My dataset has 3 columns: airlines, customer review and sentiment. Could you please advise how I can transform this dataset to work with SVM? Do I need to convert all 3 columns from nominal to numeric? In my data pre-processing, I am only processing the customer review, by converting it from nominal to text.
    Could you please advise what I am missing here?

    thanks.
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    To really understand what you need to do, I think you need to look over some of the text mining tutorials from the RapidMiner Academy. Basically, it sounds like you are going to want to process the text of your reviews and produce word vectors, and then use those to predict the sentiment, which you will set as your label. When you do the text processing, the text will become numerical through the word vector representation, so only the review text needs converting.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
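
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of the approach described above: turn each review into a word-count vector, train a linear SVM on those vectors with the sentiment as the label, and then list the words of a scored document with their counts and the direction they push the score. Everything in it is an assumption made for illustration: the four training reviews are made up, the helper names (`tokenize`, `train_linear_svm`, `score`, `explain`) are hypothetical, and the trainer is a plain subgradient descent on the hinge loss rather than RapidMiner's SVM operator. Stop-word filtering and stemming are left out for brevity. Per-word weights like these can only be read off directly when the kernel is linear.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_linear_svm(texts, labels, epochs=100, eta=0.1, lam=0.01):
    """Fit one weight per word by subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 or -1.
    There is no bias term, to keep the sketch short."""
    vectors = [Counter(tokenize(t)) for t in texts]  # word-count vectors
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * sum(w.get(tok, 0.0) * n for tok, n in x.items())
            for tok in w:  # L2 shrinkage of every weight
                w[tok] *= 1.0 - eta * lam
            if margin < 1.0:  # inside the margin or misclassified
                for tok, n in x.items():
                    w[tok] = w.get(tok, 0.0) + eta * y * n
    return w


def score(text, w):
    """Decision value: above 0 leans positive, below 0 leans negative."""
    counts = Counter(tokenize(text))
    return sum(w.get(tok, 0.0) * n for tok, n in counts.items())


def explain(text, w, top=10):
    """Return (word, count, contribution) rows for one document,
    strongest contribution first. contribution = count * weight."""
    counts = Counter(tokenize(text))
    rows = [(tok, n, n * w[tok]) for tok, n in counts.items()
            if w.get(tok, 0.0) != 0.0]
    rows.sort(key=lambda r: abs(r[2]), reverse=True)
    return rows[:top]


# Made-up training data: +1 = positive review, -1 = negative review.
train_texts = [
    "great service and friendly crew",
    "good food great seats",
    "bad service and poor food",
    "poor seats bad delay bad crew",
]
train_labels = [1, 1, -1, -1]

weights = train_linear_svm(train_texts, train_labels)

review = "Bad food, bad crew, and a bad delay, but great seats."
for word, count, contribution in explain(review, weights):
    direction = "positive" if contribution > 0 else "negative"
    print(f"{word!r}: {count} time(s), pushes {direction} ({contribution:+.2f})")
```

Words that occur only in negative training reviews (here "bad" and "poor") end up with negative weights, and words that occur only in positive ones (here "great") with positive weights, so sorting by count times weight gives the kind of "bad: 3 times" list asked for in the original question. Words the model never saw in training are skipped.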