Achieved decent accuracy with random dep variable values

tkaiser Member Posts: 8 Contributor I
edited November 10 in Help

I had a gradient boosted tree (GBT) classification model, generated with Auto Model, that produced a 70% f-measure for a given value of the dependent variable. I then replaced the dependent variable with random numbers and ran a GBT model again on exactly the same example data, and the f-measure was 65%. That is much closer than I had expected, and I am wondering how that can be the case. Thank you.

Answers

  • mschmitz Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 1,801 RM Data Scientist

    Hi,

     

    What is the f-measure if you run a Default Model operator?

     

    BR,
    Martin
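To illustrate why a baseline such as the Default Model is worth checking, here is a minimal Python sketch (not a RapidMiner process). It assumes, purely for illustration, that the random dependent variable is binary and roughly balanced, and that the baseline always predicts the class being scored; the labels `"yes"`/`"no"` and the helper `f_measure` are hypothetical names, not anything from the original data. Under those assumptions precision equals the share p of positives and recall is 1, so the f-measure is 2p / (1 + p).

```python
import random

def f_measure(actual, predicted, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Assumption: a random, roughly balanced binary dependent variable.
rng = random.Random(0)
labels = [rng.choice(["yes", "no"]) for _ in range(10_000)]

# A trivial baseline that ignores every attribute and always predicts "yes".
always_yes = ["yes"] * len(labels)

baseline_f = f_measure(labels, always_yes, "yes")
print(f"baseline f-measure: {baseline_f:.3f}")
```

If the baseline f-measure on the real data is already close to what the GBT model reports, the model's score says little about whether it has learned anything from the attributes.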

  • tkaiser Member Posts: 8 Contributor I

    Sorry, but I am not sure where that would go in the Auto Model process.

     

    And I have now uncovered a second, perhaps more pressing, problem. Auto Model ran a 3-fold cross-validation, which should validate the future accuracy of the predictive model by guaranteeing there is no overlap between training and test sets. The f-measure was about 70% and accuracy 90%. I then did a manual hold-out: I gave 90% of my original data set to Auto Model (GBT again) and kept the remaining 10% aside for testing. The performance Auto Model reported on the 90% was a little lower than, but close to, the original measures. But when I applied the model to the 10% hold-out test set, it performed terribly. I would very much appreciate some guidance, as I have now lost confidence in my model's ability to predict future data.
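The manual hold-out described above can be sketched in plain Python as follows. This is only an illustration of the splitting logic, not what Auto Model does internally; `holdout_split`, `k_folds`, the seed, and the toy `rows` are all hypothetical. The point is that the 10% is set aside before any modelling, and the cross-validation folds are built only from the remaining 90%.

```python
import random

def holdout_split(rows, test_fraction=0.1, seed=42):
    """Shuffle once, then set aside a fraction of rows as an untouched test set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(round(len(rows) * test_fraction)))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

def k_folds(n_rows, k=3, seed=42):
    """Partition row positions 0..n_rows-1 into k disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Toy data standing in for the original example set.
rows = [{"id": i} for i in range(100)]

train, test = holdout_split(rows)      # 90 rows for modelling, 10 held out
folds = k_folds(len(train), k=3)       # folds index into `train` only

print(len(train), len(test), [len(f) for f in folds])
```

If any step that looks at the label (feature selection, parameter tuning, model choice) is run on all of the data before the split, the cross-validated score can be optimistic even though the folds themselves do not overlap, so it is worth checking where in the process the hold-out rows were first separated.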
