Confusion Matrix for Analyze Sentiment Operator

jdude35 Member Posts: 6 Contributor I
edited June 2019 in Help

Hello. I have not used RapidMiner for a while, so please forgive the noob question. I am trying to recreate a RapidMiner model that I read about in an academic journal. The model uses the Aylien "Analyze Sentiment" operator to do sentiment analysis of a movie review dataset. I need to generate a confusion matrix for this model. I tried to insert a performance operator, but it kept giving me an error saying it was expecting a label. Each item in the dataset comes with a "polarity" label, so I don't know what I am doing wrong. Any suggestions?

 

[Attached images: aylien1.jpg, aylien2.jpg]

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Yes, any performance operator for models requires a label to run. The polarity from the operator is the prediction, not the actual label. To generate a confusion matrix, you need an independent label, so that the prediction can be compared against the actual outcome. Unless you hand-label the texts you are running through the Aylien API, you are not going to be able to generate a meaningful confusion matrix. Even if you pretend the Aylien prediction is the actual label, one half of the matrix will be empty (or the prediction will always match the actual, which is a useless output).

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
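A minimal Python sketch of the point above, outside RapidMiner: a confusion matrix is just a count of (actual, predicted) pairs, so it needs two independent columns. The sample data and the `confusion_matrix` helper are hypothetical illustrations, not part of the original process or of any RapidMiner or Aylien API.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels=None):
    """Count (actual, predicted) pairs; rows are actual classes, columns are predicted."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if labels is None:
        labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    return labels, [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical data: hand-assigned labels vs. the polarity returned by the sentiment operator.
actual = ["positive", "negative", "negative", "positive", "negative"]
predicted = ["positive", "negative", "positive", "positive", "negative"]

labels, matrix = confusion_matrix(actual, predicted)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)

# Degenerate case: using the prediction as its own label fills only the diagonal.
_, self_matrix = confusion_matrix(predicted, predicted)

print(labels)       # row/column order
print(matrix)       # counts with independent labels
print(self_matrix)  # counts when prediction is compared with itself
```

With independent labels the off-diagonal cells show the misclassifications. When the prediction is compared with itself, every off-diagonal cell is zero, which is the "useless output" described in the reply.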
  • SamiRami Member Posts: 5 Contributor I
    How can I generate the confusion matrix in RapidMiner? What is the name of the process? And how can I feed the actual and the predicted classes to this process?
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Take a look at the Tutorial Process for the Cross Validation operator.  This shows how to build and validate a model correctly and then output performance, including a confusion matrix.

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    This video and the videos in the related items at the bottom may be helpful as well:
    Best,
    Ingo
  • SamiRami Member Posts: 5 Contributor I
    Thanks, Ingo, for replying to my question.

    Actually, this set of videos is about validating a classification model by splitting the data into training and testing sets.

    I am talking about an unsupervised clustering process. The ground-truth classes are given, but only for the sake of evaluating the model, i.e., they are used after the clustering model has been applied. Moreover, the actual and predicted classes are multi-class, not binary.

    I would appreciate any help.

    Regards
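One possible way to get a multi-class confusion matrix for a clustering result, sketched in Python rather than as a RapidMiner process. It assumes each cluster is mapped to the most frequent ground-truth class among its members (a majority-vote convention chosen here for illustration, not something stated in this thread). The data and function names are hypothetical.

```python
from collections import Counter, defaultdict

def map_clusters_to_classes(true_labels, cluster_ids):
    """Assign each cluster the most frequent ground-truth class among its members."""
    members = defaultdict(Counter)
    for truth, cluster in zip(true_labels, cluster_ids):
        members[cluster][truth] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in members.items()}

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes, in the order of `labels`."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical ground truth and cluster assignments for seven documents.
true_labels = ["sports", "sports", "politics", "politics", "tech", "tech", "tech"]
cluster_ids = [0, 0, 1, 1, 2, 2, 1]

mapping = map_clusters_to_classes(true_labels, cluster_ids)
predicted = [mapping[c] for c in cluster_ids]  # cluster ids translated to class names
labels = sorted(set(true_labels))
matrix = confusion_matrix(true_labels, predicted, labels)

print(mapping)
print(labels)
print(matrix)
```

Majority voting can map two clusters to the same class, and ties are broken by whichever class was seen first, so the result should be read as a rough evaluation rather than an optimal cluster-to-class assignment.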

  • kypexin Moderator, RapidMiner Certified Analyst, Member Posts: 291 Unicorn
    Hi @SamiRami

    It would be easier to help you if you could share the actual dataset on which you want to produce the confusion matrix.