RM 9.1 feedback: Auto-Model documentation
I see that cross validation is now used to evaluate the performance of models in Auto-Model.
I see that the performance reported for the optimized model (calculated via a 3-fold CV on the whole training set, by default 60% of the dataset) differs from the performance delivered by the Performance average (Robust) operator (calculated via a CV with 7 - 2 = 5 folds by default on the test set, 40% of the dataset). I think this evaluation principle must be explained in the documentation of Auto-Model (in the documentation of the "results" screen). Moreover, the current documentation is out of date:
Generally, I think that these elements are important and must be read and understood by the user.
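To make the scheme I describe above concrete, here is a minimal Python sketch of how I understand it. It is not RapidMiner's actual implementation: the function names are hypothetical, the fold scores are made-up numbers for illustration, and I am assuming that the two discarded folds are the ones with the lowest and the highest score.

```python
import random
from statistics import mean


def split_train_test(n_examples, train_ratio=0.6, seed=0):
    """Shuffle example indices and split them into a training part and a hold-out test part."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_examples * train_ratio))
    return indices[:cut], indices[cut:]


def make_folds(indices, k):
    """Partition indices into k disjoint folds of nearly equal size."""
    return [indices[i::k] for i in range(k)]


def robust_average(scores):
    """Average the scores after dropping the lowest and the highest one.

    ASSUMPTION: this is my reading of "7 - 2 = 5 folds"; the real operator
    may select the two discarded folds differently.
    """
    if len(scores) < 3:
        raise ValueError("need at least 3 scores to drop two of them")
    trimmed = sorted(scores)[1:-1]
    return mean(trimmed)


# 60% training / 40% test, as in the Auto-Model defaults described above.
train_idx, test_idx = split_train_test(1000)

# The optimized model is scored with a 3-fold CV on the training part only.
cv_folds = make_folds(train_idx, 3)

# The final performance uses 7 subsets of the test part, of which 5 are kept.
test_folds = make_folds(test_idx, 7)

# Hypothetical per-fold accuracies on the 7 test subsets (illustrative only).
fold_scores = [0.80, 0.82, 0.79, 0.95, 0.81, 0.60, 0.83]
final_performance = robust_average(fold_scores)
```

The two performance figures are computed on disjoint data and with different fold counts, so there is no reason for them to match exactly. This is the point I would like the documentation to state.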
I also have a side question about Auto-Model:
Why does the data sampling differ depending on the model used? For example:
NB ==> max 2000000 examples
SVM ==> max 10000 examples
Thank you for your attention,