Is there any way you think I can do multiclass classification for this problem? That is, using all five traits to measure the performance?
It would be really helpful if you have any suggestions regarding that!
I haven't looked at this recently and I'm on another machine right now. The only wisdom I can impart on increasing the accuracy of the multiclass classification results is to look at the pruning and tokenization. I worked on a project a while back where the generic tokenization wiped out a lot of information, so we changed the tokenization parameter and specified the characters we wanted to break the words on. Just doing that gave a huge lift in accuracy and results.
For Twitter, if you use the default tokenization you'll quickly lose hashtags like #JLOisHOT, which could contain lots of information for your text processing work. Tweets also contain lots of http://t.co type links that would get destroyed too. You might want to convert http strings to a word like "link" as well.
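To make the hashtag and link point concrete, here is a rough Python sketch outside of RapidMiner. The function names and regular expressions are my own assumptions for illustration, not RapidMiner operators or settings: `naive_tokenize` imitates a split on every non-letter character, and `tokenize_tweet` replaces URLs with the word "link" and keeps the leading # or @ attached to the token.

```python
import re

# Assumption: any http/https URL (such as a t.co short link) becomes the word "link".
URL_RE = re.compile(r"https?://\S+")
# Assumption: a token is an optional # or @ followed by word characters,
# optionally with internal apostrophes (e.g. "don't").
TOKEN_RE = re.compile(r"[#@]?\w+(?:'\w+)*")


def naive_tokenize(text):
    """Split on every non-letter character, dropping empty pieces."""
    return [tok for tok in re.split(r"[^A-Za-z]+", text) if tok]


def tokenize_tweet(text):
    """Replace URLs with 'link', then keep hashtags and mentions whole."""
    text = URL_RE.sub(" link ", text)
    return TOKEN_RE.findall(text)


tweet = "OMG #JLOisHOT see http://t.co/abc123 now!"
print(naive_tokenize(tweet))
print(tokenize_tweet(tweet))
```

With the non-letter split, the hashtag loses its # and the link is broken into fragments such as "http", "t" and "co". The second tokenizer keeps "#JLOisHOT" as one token and collapses the URL into "link".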
After you do that, take a sample and use Optimize Parameters to adjust your pruning parameters. From there you might be able to see the direction you need to go to get better multiclass classification accuracy.
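If it helps to see what the pruning settings do, here is a minimal Python sketch of document-frequency pruning with a small grid over thresholds. It is not what Optimize Parameters runs internally; the helper names `prune_vocabulary` and `sweep_pruning` and the toy documents are made up for illustration, and it only reports how many tokens survive each setting. In a real process you would score each setting by cross-validated accuracy instead.

```python
from collections import Counter
from itertools import product


def prune_vocabulary(docs, min_df, max_df):
    """Keep tokens whose document-frequency fraction lies in [min_df, max_df]."""
    n_docs = len(docs)
    # Count each token once per document it appears in.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok for tok, count in doc_freq.items()
            if min_df <= count / n_docs <= max_df}


def sweep_pruning(docs, min_dfs, max_dfs):
    """Try every (min_df, max_df) pair and record the surviving vocabulary size."""
    return {(lo, hi): len(prune_vocabulary(docs, lo, hi))
            for lo, hi in product(min_dfs, max_dfs)}


# Toy tokenized sample (hypothetical data).
sample = [["a", "b"], ["a", "c"], ["a", "b", "d"], ["a"]]
for setting, vocab_size in sweep_pruning(sample, [0.0, 0.5], [0.9, 1.0]).items():
    print(setting, vocab_size)
```

Watching how quickly the vocabulary shrinks as the thresholds tighten gives a rough idea of which direction to move the pruning parameters before running the full optimization.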