Is it logical that testing error be lower than training error?

njasaj Member Posts: 18 Contributor II
edited November 2018 in Help
Hi RapidMiner Community,
I used the SVM (LibSVM) operator to build a regression model. After training with 10-fold cross-validation, the resulting correlation coefficient was 84 and the RMSE was 0.048. Applying this model to the test data set, I got a correlation coefficient of 88.5 and an RMSE of 0.037. Now I need to know: is it possible or logical that the testing error is lower than the training error?
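To make the setup concrete, here is a minimal sketch of the same comparison, assuming scikit-learn's SVR (which also wraps LibSVM) and synthetic data in place of the original RapidMiner process; every name and number in it is illustrative:

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.svm import SVR

    # Made-up regression data standing in for the real data set.
    X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                        random_state=0)

    model = SVR(kernel="rbf", C=10.0, epsilon=0.1)

    # 10-fold cross-validation on the training data: one RMSE per fold.
    rmse_folds = -cross_val_score(model, X_train, y_train, cv=10,
                                  scoring="neg_root_mean_squared_error")
    print("CV RMSE per fold:", np.round(rmse_folds, 3))
    print("CV RMSE mean +/- std: %.3f +/- %.3f"
          % (rmse_folds.mean(), rmse_folds.std()))

    # The held-out test set gives a single number, which can easily fall
    # below the CV average because it is one draw, not an average.
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print("Test RMSE: %.3f" % np.sqrt(np.mean((y_test - pred) ** 2)))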
Thanks.

Answers

  • fras Member Posts: 93 Contributor II
    Hi,
    Yes, this is possible. Keep in mind that a single test set does _not_ deliver
    representative results; that is why we use cross-validation, which averages over
    several test sets. So trust cross-validation for choosing the right SVM parameters,
    and finally train your model on the full data. A rough sketch of that workflow follows below.
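    A minimal sketch of that workflow, assuming scikit-learn as a stand-in for the
    LibSVM operator and a made-up parameter grid:

        from sklearn.datasets import make_regression
        from sklearn.model_selection import GridSearchCV
        from sklearn.svm import SVR

        # Illustrative data; in practice this is the full data set.
        X, y = make_regression(n_samples=300, n_features=10, noise=5.0,
                               random_state=0)

        # Cross-validation chooses the SVM parameters ...
        param_grid = {"C": [1, 10, 100], "epsilon": [0.01, 0.05, 0.1]}
        search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=10,
                              scoring="neg_root_mean_squared_error")
        search.fit(X, y)
        print("Best parameters:", search.best_params_)
        print("CV RMSE of best model: %.3f" % -search.best_score_)

        # ... and the final model is refit on all of the data
        # (GridSearchCV does this itself with its default refit=True).
        final_model = search.best_estimator_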
    Cheers, Frank
  • njasaj Member Posts: 18 Contributor II
    Thanks for your reply. Would you mind explaining this a bit more? I thought cross-validation was only for the training data set, and that after finding the model parameters you just apply the model to the test data set. Do I understand correctly that you recommend using cross-validation on the test data? And if so, what happens to the other part of the data that CV splits off for training?
    Thank you.