"Text Scoring Via Word Tagging"
I was wondering whether the text plugin could be configured to use an external list of word tags to score documents. In this specific case, the list is the Harvard-IV dictionary used by the General Inquirer program, which is perhaps one of the oldest text sentiment extraction tools. The Java web version of the program can be found here: http://www.webuse.umd.edu:9090/
The Harvard-IV dictionary contains several categories. Each category is scored for a text document according to the term frequency of the document's words that appear in that category. For example:
Positive: 1045 words. Words such as good, rose, happy count towards the Positive score.
Negative: 1160 words. Words such as bad, fell, sad count towards the Negative score.
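To make clear what I mean by scoring, here is a rough sketch in Python of the kind of per-category term-frequency count I am after. The word sets are only the example words above (a hypothetical stand-in, not the real Harvard-IV lists), the function name score_document and the simple lowercase tokenizer are my own assumptions, and the General Inquirer may well preprocess text differently.

```python
import re
from collections import Counter

# Hypothetical, tiny stand-in for the Harvard-IV categories.
# The real dictionary has far more entries per category.
CATEGORIES = {
    "Positive": {"good", "rose", "happy"},
    "Negative": {"bad", "fell", "sad"},
}


def score_document(text, categories=CATEGORIES):
    """Return, per category, how many tokens of `text` belong to it."""
    # Assumed tokenizer: lowercase, keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Sum the term frequencies of every dictionary word in the category.
    return {
        name: sum(counts[word] for word in words)
        for name, words in categories.items()
    }


if __name__ == "__main__":
    sample = "Shares rose on good news, then fell. Good day overall."
    print(score_document(sample))
```

No stemming is applied here, so "rose" and "fell" are matched as written rather than being reduced to other forms first.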
Is there a way to get RapidMiner to handle this? I have tried the dictionary stemmer, but it seems inaccurate: it gives significantly different results from the General Inquirer scores for the same text.