"Sentiment Analysis - Numerical Labels, and the search for the right Process"
I have another question, which may be easy to answer for those of you who have already worked with the sentiment analysis features of RapidMiner.

On the one hand, I have a collection of thousands of documents from which I extracted the information I need and compiled a matrix with the TF-IDF scores of the expressions appearing in the documents. On the other hand, I have a matrix of words in which each word is assigned a sentiment score between 0 and 1.

The question is how to bring these two together to measure the sentiment reflected in the documents over time. The idea is to match the TF-IDF matrix against the word/sentiment score matrix. More precisely, I want to find which expressions from the sentiment matrix also appear in each document and weight their sentiment scores with the respective TF-IDF values. Is there a process which does this?

I tried to follow the example described here http://rapid-i.com/rapidforum/index.php/topic,2993.0.html and the classification approach presented in the Vancouver Data Blog Video Tutorial 5, but the problem seems to hinge on the fact that the learning operators don't accept numerical labels. Could somebody give me a hint? I would really appreciate it!
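To make the intended calculation concrete, here is a minimal sketch in plain Python rather than RapidMiner. It assumes each document is a mapping from term to TF-IDF score and the lexicon is a mapping from word to sentiment score; the function name `document_sentiment` and the toy data are made up for illustration, and normalising by the sum of the matched weights is only one possible choice.

```python
def document_sentiment(tfidf_row, lexicon):
    """Return the TF-IDF-weighted mean sentiment of the terms that
    appear in both the document and the lexicon, or None if no term matches."""
    weighted_sum = 0.0
    weight_total = 0.0
    for term, weight in tfidf_row.items():
        # Only terms present in the sentiment lexicon contribute.
        if term in lexicon and weight > 0:
            weighted_sum += weight * lexicon[term]
            weight_total += weight
    if weight_total == 0:
        return None  # no lexicon term occurs in this document
    return weighted_sum / weight_total


# Toy data (hypothetical): term -> TF-IDF score per document.
tfidf = {
    "doc1": {"good": 0.5, "bad": 0.5, "market": 0.8},
    "doc2": {"good": 0.75, "bad": 0.25},
    "doc3": {"market": 0.4},
}

# Toy lexicon (hypothetical): word -> sentiment score between 0 and 1.
lexicon = {"good": 1.0, "bad": 0.0}

scores = {doc: document_sentiment(row, lexicon) for doc, row in tfidf.items()}
print(scores)
```

Each document ends up with one numeric score (or no score if none of its terms are in the lexicon), which could then be grouped by document date to follow sentiment over time.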