I am looking for best-practice advice on applying a model for live prediction, as I am a little confused about which approach is normally adopted.

The data: My data set has 50 attributes and 3400 rows (90% for training, 10% for unseen testing), with the very last row reserved as the live prediction example.

The training: I run 10-fold X-validation on the 90% training data to find the best learning algorithm and attribute mix for my data. I then confirm the chosen setup by applying the resulting model to the 10% of unseen data.
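
The workflow described above can be sketched in plain Python, roughly as follows. Everything here is an assumption made for illustration: the synthetic two-attribute data, the k-nearest-neighbour learner, the candidate values of `k`, and the scaled-down size of 340 labelled rows (instead of 3400) to keep it quick. It is not a RapidMiner process, only the same sequence of steps: hold out 10%, compare candidate settings by 10-fold cross-validation on the other 90%, then check the winner on the held-out rows.

```python
import random

def make_rows(n, seed=0):
    """Synthetic stand-in for the real table: two numeric attributes and a 0/1 label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted by 2 on both attributes, so the classes overlap a little.
        features = (rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0))
        rows.append((features, label))
    return rows

def knn_predict(train, features, k):
    """Majority vote among the k training rows closest to `features`."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    nearest = sorted(train, key=sq_dist)[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, test, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def cv_accuracy(rows, k, folds=10):
    """Average accuracy over `folds` interleaved folds."""
    scores = []
    for i in range(folds):
        held_out = rows[i::folds]
        kept = [r for j, r in enumerate(rows) if j % folds != i]
        scores.append(accuracy(kept, held_out, k))
    return sum(scores) / folds

rows = make_rows(341)                        # 340 labelled rows plus 1 "live" row
labelled, live_row = rows[:-1], rows[-1]     # the live row is set aside untouched
random.Random(1).shuffle(labelled)
cut = len(labelled) * 9 // 10                # 90% / 10% split
train_rows, holdout_rows = labelled[:cut], labelled[cut:]

# Compare candidate settings by 10-fold cross-validation on the 90% only.
cv_scores = {k: cv_accuracy(train_rows, k) for k in (1, 5, 15)}
best_k = max(cv_scores, key=cv_scores.get)

# Confirm the chosen setting on the 10% that played no part in the selection.
holdout_score = accuracy(train_rows, holdout_rows, best_k)
print(cv_scores, best_k, holdout_score)
```

The hold-out score is only a sanity check on the selection; which model should then score `live_row` is the question asked below.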

My question is: once I am happy with the above results, which model do I use (or create) for the live prediction of the last row?

1) Do I use the best model created via 10-fold X-validation on the 90% training data?
2) Do I create a model on the 90% training data (without X-validation), using the best settings found during the X-validation training?
3) Do I create a model on 100% of the data (90% training plus 10% unseen), using the best settings found during training?

Thank you in advance for your time.


    With a dataset that small, my advice would be to go with (1): select based on X-validation.

    With a large dataset you could go with (2): select based on a training/test split. You can do without X-validation there.

    Whatever you pick, don't ever do (3), as you risk over-fitting the data badly.

    Some authors recommend splitting the dataset into training/test/validation sets: train your models on the training set, compare them on the test set, pick the best, and then estimate the error rate of the best model again on the validation set.
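
A minimal sketch of that three-way split, in plain Python. The data is synthetic, the 60/20/20 proportions are an assumption (the post gives none), and the candidate "models" are hypothetical one-attribute threshold rules chosen only to keep the example short.

```python
import random

def make_rows(n, seed=0):
    """Synthetic data: three numeric attributes, only attribute 0 carries signal."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = (rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        rows.append((features, label))
    return rows

def fit_stump(rows, attr):
    """Candidate model: threshold halfway between the two class means of one attribute."""
    means = []
    for cls in (0, 1):
        values = [x[attr] for x, y in rows if y == cls]
        means.append(sum(values) / len(values))
    threshold = (means[0] + means[1]) / 2
    high_class = 1 if means[1] > means[0] else 0
    return attr, threshold, high_class

def predict(model, features):
    attr, threshold, high_class = model
    return high_class if features[attr] > threshold else 1 - high_class

def error_rate(model, rows):
    return sum(predict(model, x) != y for x, y in rows) / len(rows)

rows = make_rows(1000)
random.Random(1).shuffle(rows)
train, test, validation = rows[:600], rows[600:800], rows[800:]

models = [fit_stump(train, attr) for attr in range(3)]   # train on the training set
test_errors = [error_rate(m, test) for m in models]      # compare on the test set
best = models[test_errors.index(min(test_errors))]       # pick the best
final_error = error_rate(best, validation)               # re-estimate on untouched rows
print(best[0], test_errors, final_error)
```

The point of the third set is that `final_error` comes from rows that influenced neither the fitting nor the choice between candidates.
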
    Thanks for the quick response, earmijo.

    Can I just confirm: you are advocating using the "best" model created by the 10-fold X-validation method, and not retraining the model with the "best" model settings on the complete data set?
    The way X-Validation works in RapidMiner is that you use it to estimate the "out-of-sample" error, but the model you report is the one trained on the entire dataset. Notice, for instance, that when you use 10-fold X-validation the model is estimated 11 times: once per fold, plus once on all the data.
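
The "11 times" count can be illustrated with a small plain-Python sketch. This only mirrors the behaviour described in the reply; it is not RapidMiner's implementation, and the toy "model" (predicting the mean of the training targets) and the `fit_calls` counter are hypothetical stand-ins.

```python
fit_calls = {"count": 0}

def fit_mean(targets):
    """Toy 'model': always predict the mean of the training targets."""
    fit_calls["count"] += 1
    return sum(targets) / len(targets)

def cross_validate_and_refit(targets, folds=10):
    fold_errors = []
    for i in range(folds):
        held_out = targets[i::folds]
        kept = [t for j, t in enumerate(targets) if j % folds != i]
        model = fit_mean(kept)                       # one fit per fold
        fold_errors.append(sum((t - model) ** 2 for t in held_out) / len(held_out))
    cv_error = sum(fold_errors) / folds              # out-of-sample error estimate
    final_model = fit_mean(targets)                  # one extra fit on all rows
    return final_model, cv_error

targets = [float(i % 7) for i in range(100)]
final_model, cv_error = cross_validate_and_refit(targets)
print(fit_calls["count"])   # 11
```

The ten fold models exist only to produce `cv_error`; the model handed back for prediction is the eleventh one, fitted on everything passed in.
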
    Fantastic - I understand - Thank you for your advice.
    I have a question. Do you apply the trained model with the model applier right after the XValidation, or do you have to train again on the whole training set after the XValidation? I am asking because if you do a feature selection with an inner XValidation, you don't get a model out of the feature selection (there is no connection point). However, you could save the model with a "remember operator" inside the FS, recall the model outside the FS operator, and combine it with the feature weights operator for the unseen test set. But I think one has to retrain on the full training set with the selected features, right?
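
One way to picture the retraining variant mentioned at the end of this post, sketched in plain Python rather than with RapidMiner operators. All of it is assumed for illustration: the synthetic four-attribute data, the nearest-centroid learner, and the exhaustive search over attribute subsets. The inner cross-validation is used only to score subsets; the final model is then fitted once on the full training set with the selected attributes and applied to the unseen rows.

```python
import random
from itertools import combinations

def make_rows(n, seed=0):
    """Synthetic data: attributes 0 and 1 carry signal, attributes 2 and 3 are noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = (rng.gauss(1.5 * label, 1.0), rng.gauss(1.5 * label, 1.0),
                    rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        rows.append((features, label))
    return rows

def fit_centroids(rows, attrs):
    """Nearest-centroid model restricted to the attribute subset `attrs`."""
    centroids = {}
    for cls in (0, 1):
        members = [x for x, y in rows if y == cls]
        centroids[cls] = [sum(x[a] for x in members) / len(members) for a in attrs]
    return attrs, centroids

def predict(model, features):
    attrs, centroids = model
    def sq_dist(cls):
        return sum((features[a] - c) ** 2 for a, c in zip(attrs, centroids[cls]))
    return min(centroids, key=sq_dist)

def accuracy(model, rows):
    return sum(predict(model, x) == y for x, y in rows) / len(rows)

def cv_accuracy(rows, attrs, folds=10):
    scores = []
    for i in range(folds):
        held_out = rows[i::folds]
        kept = [r for j, r in enumerate(rows) if j % folds != i]
        scores.append(accuracy(fit_centroids(kept, attrs), held_out))
    return sum(scores) / folds

rows = make_rows(500)
random.Random(1).shuffle(rows)
train_rows, unseen_rows = rows[:450], rows[450:]

# Feature selection: score every non-empty attribute subset by inner cross-validation.
subsets = [c for size in range(1, 5) for c in combinations(range(4), size)]
selected = max(subsets, key=lambda attrs: cv_accuracy(train_rows, attrs))

# The fold models are discarded; refit once on the full training set with the chosen attributes.
final_model = fit_centroids(train_rows, selected)
unseen_accuracy = accuracy(final_model, unseen_rows)
print(selected, unseen_accuracy)
```

Whether a given RapidMiner process needs the explicit refit depends on which operator delivers the model, so this is a sketch of the idea rather than a statement about the tool.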
