Best way to spot examples in testing set that receive a wrong classification?

maverik Member Posts: 10 Contributor I
edited November 2019 in Help
Hello! I have a dataset of 486 examples, 53 attributes including a binominal target attribute (0, 1). I use 80% for training and 20% for testing. In the X-validation operator, the training part contains the Decision Tree operator inside of the Bayesian Boosting operator; the testing part contains the Apply Model operator connected to the Performance operator.

With Decision Tree alone, I get about 64% correct predictions on the testing set. With Bayesian Boosting, I get about 79%. In the result section, I can see a green-colored column showing the prediction for the target attribute for all 486 examples.

My questions are:
1. Is there a reason that the predictions shown are for all examples, rather than for the testing examples only?
2. What's the best way to spot and isolate the examples that are incorrectly predicted?

Many thanks!

Answers

  • fras Member Posts: 93 Contributor II
    The X-Validation operator does not deliver predictions at all. You
    can replace X-Validation with X-Prediction (without the Performance
    operator); you then get "realistic" predictions in the result
    perspective. There you can choose "wrong_predictions" to "spot and
    isolate the examples that are incorrectly predicted".
  • maverik Member Posts: 10 Contributor I
    Thank you fras! This indeed solves my problem. Could you also advise whether there is a similar operator for "split prediction"? I was not able to find one.
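
For readers who want to see the idea outside the RapidMiner GUI, here is a minimal Python sketch of what the thread describes: every example is predicted by a model that was trained without it (the out-of-fold idea behind X-Prediction), and the rows where the prediction differs from the label are then filtered out. This is not RapidMiner code. The function names, the toy data, and the 1-nearest-neighbour learner are assumptions made for illustration; the learner is only a stand-in for the Decision Tree / Bayesian Boosting setup in the question.

```python
import random


def predict_1nn(train, features):
    """Return the label of the training example closest to `features`.

    Stand-in learner (assumption): the thread uses a boosted decision tree.
    """
    def squared_distance(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], features))
    return min(train, key=squared_distance)[1]


def out_of_fold_predictions(examples, k=3, seed=0):
    """Predict every example with a model trained on the other folds only."""
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index groups
    predictions = [None] * len(examples)
    for fold in folds:
        held_out = set(fold)
        train = [examples[i] for i in indices if i not in held_out]
        for i in fold:
            predictions[i] = predict_1nn(train, examples[i][0])
    return predictions


def wrong_predictions(examples, predictions):
    """Return (index, true_label, predicted_label) for misclassified examples."""
    return [
        (i, label, predictions[i])
        for i, (_, label) in enumerate(examples)
        if predictions[i] != label
    ]


# Toy data as (features, label). The last example is deliberately mislabeled:
# it sits near the class-0 cluster but carries label 1.
examples = [
    ((0.0, 0.1), 0), ((0.1, 0.0), 0), ((0.2, 0.1), 0), ((0.1, 0.2), 0),
    ((5.0, 5.1), 1), ((5.1, 5.0), 1), ((5.2, 5.1), 1), ((5.1, 5.2), 1),
    ((1.0, 1.0), 1),
]

predictions = out_of_fold_predictions(examples, k=3)
wrong = wrong_predictions(examples, predictions)
print(wrong)  # [(8, 1, 0)]
```

For a single 80/20 split, the same filter step applies; the only difference is that predictions exist for the held-out 20% only, so the comparison runs over those indices rather than over all examples.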