Sentiment Analysis
crimson_crow
Member Posts: 3 Learner II
in Help
Hello! I'm new to RapidMiner and I want to learn sentiment analysis for my coursework. The goal is to build a model that classifies reviews as positive or negative. RapidMiner includes an example process for this, but I want to change a couple of things:
1. Replace the example set with my own, which has more data.
2. Instead of a document with only one review to be scored by the model, I want to use an .xlsx file of reviews that I parsed from the IMDb site.
The problems are in the "Cross Validation" operator (screenshot "First Problem") and in the "Read Document" operator (screenshot "Second Problem").
I can't understand why the "Cross Validation" operator reports a type problem, since my data has the same structure as in the example. Also, which operator should I use to read the parsed data from the .xlsx file correctly?
Best Answer
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @crimson_crow,
Thanks for sharing your process and data.
You have to:
- Apply the same pre-processing steps in your training branch and in your scoring branch: put a Nominal to Text operator in your scoring branch (you don't need a Read Document operator).
- Add a Process Documents from Data operator to your scoring branch (as in your training branch).
- Simplify your Cross Validation operator: I just used an SVM model in the training part, and an Apply Model and a Performance (Binominal Classification) operator in the testing part.
The working process is in the attached file.
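For readers who think in code, the first two points can be illustrated outside RapidMiner. This is only a minimal Python sketch, not the attached process: the function names, the sample reviews, and the toy word-weight model (swapped in for the SVM to keep the example dependency-free) are all assumptions made for illustration. What it shows is that the text of new reviews goes through exactly the same tokenization and the same training word list as the training reviews before the model is applied.

```python
import re


def tokenize(text):
    """Lowercase and split into word tokens (stand-in for the text pre-processing step)."""
    return re.findall(r"[a-z']+", text.lower())


def fit_vocabulary(texts):
    """Build the word list from the training texts only."""
    words = sorted({tok for text in texts for tok in tokenize(text)})
    return {word: index for index, word in enumerate(words)}


def vectorize(text, vocab):
    """Count vector over the training word list; words unseen in training are dropped."""
    vector = [0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vector[vocab[tok]] += 1
    return vector


def train(texts, labels, vocab):
    """Toy model (not an SVM): weight = count in positive minus count in negative reviews."""
    weights = [0] * len(vocab)
    for text, label in zip(texts, labels):
        sign = 1 if label == "positive" else -1
        for index, count in enumerate(vectorize(text, vocab)):
            weights[index] += sign * count
    return weights


def predict(text, vocab, weights):
    """Score a review with the SAME tokenizer and word list used in training."""
    score = sum(w * c for w, c in zip(weights, vectorize(text, vocab)))
    return "positive" if score >= 0 else "negative"


# Hypothetical training data; in practice this would come from the labeled example set.
train_texts = [
    "a great and moving film",
    "great acting, loved it",
    "boring and awful plot",
    "awful, a waste of time",
]
train_labels = ["positive", "positive", "negative", "negative"]

vocab = fit_vocabulary(train_texts)
weights = train(train_texts, train_labels, vocab)

# Hypothetical unlabeled reviews, e.g. rows loaded from a spreadsheet.
new_reviews = ["Loved the great cast", "Such an awful, boring sequel"]
predictions = [predict(review, vocab, weights) for review in new_reviews]
print(predictions)
```

If the scoring side built its own word list instead of reusing `vocab`, the vector positions would no longer line up with the trained weights, which is the kind of mismatch that identical pre-processing in both branches avoids.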
Regards,
Lionel