SVM Extract keywords used for sentiment
Good afternoon,
Hopefully somebody can help. I am playing around with sentiment analysis using SVM and results are very promising. My question is how can I easily extract a list of words from the document that I score to see exactly why the sentiment is negative or positive.
Example: when I score a 100-word paragraph, I want to see all the keywords that the SVM identified as important. It would be great to have counts as well, for example "bad" 4 times, "poor" 3 times.
Any advice is very much appreciated.
Best Answers
B00100719 Member Posts: 11 Contributor II
Assuming you have used the 'Tokenize', 'filter stopwords', 'transform cases', and maybe 'filter by length' operators, experiment with and without 'stem' (probably not useful for such a small document). Be sure to check "Create Word Vector" on the 'Process Documents from Data' operator, which contains all these operators, and set the lower and upper pruning on that operator too; this also requires experimenting with different values. The interesting words will likely be those that appear a medium number of times.
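Outside RapidMiner, the same preprocessing steps can be sketched in a few lines of plain Python. This is only an illustration of the idea, not what the operators do internally: the stopword list, the length limits, and the function names `tokenize` and `word_vectors` are made up for the example, and pruning is assumed to work on the number of documents a word appears in.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "was", "is", "it", "to", "of", "but", "very"}

def tokenize(text, min_len=3, max_len=25):
    """Lowercase, split on non-letters, drop stopwords and tokens outside the length range."""
    tokens = re.split(r"[^a-z]+", text.lower())
    return [t for t in tokens
            if t and t not in STOPWORDS and min_len <= len(t) <= max_len]

def word_vectors(documents, prune_below=1, prune_above=None):
    """Return (vocabulary, per-document word counts) after pruning by document frequency."""
    tokenized = [tokenize(d) for d in documents]
    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    upper = len(documents) if prune_above is None else prune_above
    vocab = sorted(w for w, df in doc_freq.items() if prune_below <= df <= upper)
    keep = set(vocab)
    counts = [Counter(t for t in toks if t in keep) for toks in tokenized]
    return vocab, counts

docs = [
    "The service was bad, the food was bad and the room was poor.",
    "Great service and a great room.",
    "Poor food but the service was great.",
]
vocab, counts = word_vectors(docs, prune_below=2)
print(vocab)
print(counts[0])
```

With `prune_below=2` the word "bad" disappears from the vocabulary because it occurs in only one of the three documents, even though it occurs twice there. That is one reason the pruning thresholds need some experimenting.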
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
You can also build your SVM model using the tokenized words and then use the Explain Predictions operator afterwards, which will help identify the terms that are most strongly associated with the label prediction for different groups of examples.
MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist
Hi,
In a linear SVM you can also use the attribute weights, which are delivered as a measure of the importance of a word for the overall decision.
Best,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
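To make the attribute-weight idea concrete, here is a minimal Python sketch of how per-word weights from a linear model could be combined with the word counts of one scored document to get the kind of list asked for in the original question. The `weights` values are invented for the example (they are not output from any real model), `explain` is a hypothetical helper, and the calculation assumes the word vector holds raw term occurrences rather than TF-IDF.

```python
import re
from collections import Counter

# Hypothetical per-word weights, as if exported from a trained linear SVM.
# Positive values push toward "positive", negative values toward "negative".
weights = {"bad": -1.2, "poor": -0.9, "slow": -0.4, "great": 1.1, "friendly": 0.8}

def explain(text, weights):
    """Return (word, count, contribution) for every weighted word found in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    rows = [(w, counts[w], counts[w] * weights[w]) for w in weights if counts[w] > 0]
    # Largest absolute contribution first.
    return sorted(rows, key=lambda r: abs(r[2]), reverse=True)

review = ("Bad room, bad food, bad wifi and bad parking. "
          "Poor service, poor value, poor view. Friendly staff.")

for word, count, contribution in explain(review, weights):
    print(f"{word!r}: {count} times, contribution {contribution:+.2f}")
```

Words with a large negative contribution are the ones pulling this particular document toward the negative class; words that carry a weight but do not occur in the document are simply left out.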
Answers
Can I use the word list generated by the Process Documents from Data operator as an input for the SVM operator?
My dataset has a binominal label and a polynominal review text. I am not sure which column I need to convert to numerical to work with SVM: the sentiment label column or the customer review text column?
Thanks much in advance.
Hi,
Could you please kindly explain the mentioned approach?
I would like to use SVM to extract the aspects that are associated with the label (e.g. aspect = 'service', label = positive) for each example in the dataset.
I am having an issue with using my dataset as the training dataset. It said SVM cannot accept polynominal data. However, I have 3 columns in the dataset, i.e. airlines, customer review and sentiment. Could you please advise how I can transform this dataset to work with SVM? Do I need to transform nominal to numeric for all 3 columns? For my data pre-processing, I am only processing the customer review, by converting it from nominal to text.
Could you please advise what I am missing here?
thanks.
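As a rough illustration of the kind of transformation the question is about, the sketch below turns only the review text into numeric word-count columns and leaves the sentiment label as it is. It is plain Python rather than a RapidMiner process, the two rows and the helper `to_numeric` are hypothetical, and it assumes the SVM needs numeric input attributes while the label can stay nominal. The airline column is not used here; it would need its own conversion or could be left out.

```python
import re

# Hypothetical rows mirroring the three columns described above.
rows = [
    {"airline": "A", "review": "Great service and friendly crew", "sentiment": "positive"},
    {"airline": "B", "review": "Poor service and a long delay", "sentiment": "negative"},
]

def to_numeric(rows, text_column="review", label_column="sentiment"):
    """Turn the text column into numeric word counts; leave the label nominal."""
    tokenized = [re.findall(r"[a-z]+", r[text_column].lower()) for r in rows]
    vocab = sorted({t for toks in tokenized for t in toks})
    # One numeric column per word, one row per review.
    features = [[toks.count(w) for w in vocab] for toks in tokenized]
    labels = [r[label_column] for r in rows]
    return vocab, features, labels

vocab, features, labels = to_numeric(rows)
print(vocab)
for vector, label in zip(features, labels):
    print(vector, label)
```

Each word becomes one numeric attribute, which is also what makes per-word weights available for interpretation afterwards.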