"Sentiment Prediction through Text Mining"

Braulio Member Posts: 7 Contributor II
edited May 2019 in Help
Hi there!

We would like to start building a sentiment prediction system. The system should be able to predict the market mood about companies and products.
I know this is a complex task, and I would like to get some tips on building such a system with RapidMiner.

Would Text Classification with Cross Validation be suitable to separate good from bad news about a company X?

Do you recommend any books in this field?

Thank you very much and hope to get things going soon.

Braulio

Answers

  • TobiasMalbrecht Moderator, Employee, Member Posts: 294 RM Product Management
    Hi Braulio,

    Braulio wrote:

    We would like to start building a sentiment predicting system. That system should be able to predict market humor about companies/products.
    I know this is a complex task and I would like to get some tips about building such a system using Rapid Miner.

    Well, building such a system is certainly possible with RapidMiner. In fact, RapidMiner is well suited to such tasks: despite its wide coverage of data mining, BI, and ETL approaches, it is highly intuitive and lends itself to what we call Rapid Prototyping, that is, setting up initial data mining processes easily and in relatively little time. Because RapidMiner can also be integrated easily into your applications, you can build a full system around the processes you have set up.
    Braulio wrote:

    Would Text Classification with Cross Validation be suitable to separate good from bad news about a company X?
    Yes, learning a text classification model would be one approach to separating good news from bad news. This of course means you have to supply (or crawl from the internet) text passages that you can present to the learner so that it can learn a model.

    Coming to the second part: cross validation is an essential part of almost all supervised learning tasks in which the performance of a model should be validated. Hence, it is suitable and recommended in your application area as well if you want to know how well your models can separate the good news from the bad.
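The idea of training a text classifier and validating it with cross validation can be sketched outside RapidMiner in plain Python. This is only a minimal illustration, not a RapidMiner process: the Naive Bayes learner, the function names, and the ten toy headlines are all assumptions made up for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data: (headline, label). Invented for illustration only.
DOCS = [
    ("Company X reports record profits and strong growth", "good"),
    ("Shares of Company X rise after excellent earnings", "good"),
    ("Company X wins major contract, outlook strong", "good"),
    ("Analysts praise Company X for excellent results", "good"),
    ("Company X profits rise on strong demand", "good"),
    ("Company X faces lawsuit over product recall", "bad"),
    ("Shares of Company X fall after weak earnings", "bad"),
    ("Company X reports heavy losses and layoffs", "bad"),
    ("Regulators fine Company X, outlook weak", "bad"),
    ("Company X losses widen as demand falls", "bad"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_nb(docs):
    """Count documents per class and words per class (multinomial Naive Bayes)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, k=5):
    """k-fold cross validation: every document is used for testing exactly once."""
    accuracies = []
    for fold in range(k):
        test = docs[fold::k]
        train = [d for i, d in enumerate(docs) if i % k != fold]
        model = train_nb(train)
        correct = sum(predict_nb(model, text) == label for text, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


if __name__ == "__main__":
    print("mean accuracy over 5 folds:", cross_validate(DOCS, k=5))
```

The point of the folds is that each model is scored only on headlines it did not see during training, which is what makes the accuracy estimate meaningful. A real system would need far more labeled text than this toy list.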

    Hope that helps,
    Tobias
  • Braulio Member Posts: 7 Contributor II
    Hi Tobias and thanks for the post.

    I am just wondering how the sentiment could be predicted at the entity level. Getting an overall sentiment for a text is one thing. Getting different sentiments about different entities is another thing entirely. In the same text, there can be a positive sentiment about company X in the first paragraph and a negative sentiment about company Y in the second.

    Any tips on how to handle such a task with Rapid-i would be greatly appreciated.

    Thank you very much

    Braulio
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    you could try to segment the data with the corresponding operators and see if the segments get different sentiment classifications.

    Cheers,
    Ingo
  • Braulio Member Posts: 7 Contributor II
    mierswa wrote:


    you could try to segment the data with the corresponding operators and see if the segments get different sentiment classifications.

    How would you segment that, Ingo? By each paragraph or phrase containing the targeted entities?

    Thanks a lot.

    Unfortunately, I could not get to Dortmund to attend the seminars this week.

    I am building a team to work in this area in Brazil (Portuguese language), and I will certainly need some partners who already have expertise in building such systems. I hope we can get in touch.

    Vielen Dank (thank you very much) ;)

    Braulio Medina

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    the text plugin contains an operator called "Segmenter". It is a bit hard to configure, but it should do the trick. Otherwise, you could crawl and / or segment the data yourself and let RapidMiner do the mining stuff.
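If you segment the data yourself, one simple option is to split the text into paragraphs and classify each paragraph that mentions a target entity. The sketch below is a rough illustration in plain Python, not the configuration of the "Segmenter" operator. The tiny word lists stand in for a trained sentiment model, and all names in it are hypothetical.

```python
import re

# Hypothetical toy lexicon; a trained classifier would normally replace it.
POSITIVE = {"strong", "record", "growth", "excellent", "gains"}
NEGATIVE = {"weak", "losses", "lawsuit", "recall", "layoffs"}


def split_paragraphs(text):
    """Split on blank lines and drop empty segments."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def score_segment(segment):
    """Label a segment by counting positive and negative lexicon words."""
    tokens = re.findall(r"[a-z]+", segment.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def entity_sentiments(text, entities):
    """Collect one sentiment label per paragraph that mentions each entity."""
    results = {entity: [] for entity in entities}
    for paragraph in split_paragraphs(text):
        label = score_segment(paragraph)
        for entity in entities:
            # Plain case-insensitive substring match; no real entity recognition.
            if entity.lower() in paragraph.lower():
                results[entity].append(label)
    return results


if __name__ == "__main__":
    article = (
        "Company X posted record growth this quarter.\n\n"
        "Company Y announced layoffs after heavy losses."
    )
    print(entity_sentiments(article, ["Company X", "Company Y"]))
```

A paragraph that mentions two entities gets the same label for both here, so finer segments (sentences or phrases) may be needed when entities are mixed within a paragraph.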

    Braulio wrote:

    Unfortunately I could not get to Dortmund to attend the seminars this week.
    Yes, it's a shame - but I am sure we can get in touch later.

    Cheers,
    Ingo
  • Braulio Member Posts: 7 Contributor II
    mierswa wrote:


    the text plugin contains an operator called "Segmenter".
    The Text Plugin is GREAT, or, as I would say in German, Hammerhart

    I will work with the Segmenter.

    Thanks a lot