Cross Validation operator model output
Hello,
I am trying to understand the model output of the Cross Validation operator. For example, in a 5-fold cross-validation, 5 models are trained and tested. Does the Cross Validation operator output the last fold's model or the best of the 5 models?
@mschmitz @lionelderkrikor
Thanks,
Varun
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Best Answers
IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
Hi @varunm1,
The model is built on the complete input data. This is just a convenience feature. There is simply no best model: the whole point of cross-validation is to estimate how well a model trained on the full data will perform (so it is about validating that model, not about model selection). Search here on the community if you want to learn more about it; there have been a couple of discussions in the past, e.g. this one: https://community.rapidminer.com/discussion/52657/cross-validation
Hope that helps,
Ingo
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @varunm1,
If I remember correctly, for an N-fold CV, RapidMiner performs N+1 iterations.
So in practice, in your case, for a 5-fold CV, 5 models are trained and tested to obtain the average performance.
Then a 6th iteration is performed and a model is built from the whole training set. It is this model that is delivered at the model output port of the CV operator, and it is this model that is associated with the confusion matrix of the Performance operator.
To convince yourself, you can set a breakpoint after the model inside the CV operator.
Hope this helps,
Regards,
Lionel
NB: Experts, please correct me if I'm wrong in my explanation...
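For readers who prefer code, the N+1 behaviour described in the two answers above can be sketched in plain Python. This is only an illustration of the idea, not RapidMiner's implementation: the names `train_mean_model`, `mean_abs_error` and `cross_validate`, the toy "predict the mean" learner, and the sample data are all hypothetical.

```python
from statistics import mean


def train_mean_model(rows):
    """Hypothetical learner: always predicts the mean label of its training rows."""
    avg = mean(label for _, label in rows)
    return lambda features: avg


def mean_abs_error(model, rows):
    """Mean absolute error of a model on (features, label) rows."""
    return mean(abs(model(features) - label) for features, label in rows)


def cross_validate(rows, k):
    # Split the rows into k folds (simple interleaved split, no shuffling).
    folds = [rows[i::k] for i in range(k)]

    # Iterations 1..k: train on k-1 folds, score on the held-out fold.
    fold_errors = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        fold_model = train_mean_model(train)  # used for scoring only, then discarded
        fold_errors.append(mean_abs_error(fold_model, folds[i]))

    # Iteration k+1: the model that is returned is trained on ALL rows.
    final_model = train_mean_model(rows)
    return final_model, mean(fold_errors), fold_errors


# Made-up data: feature tuple (i,) with label float(i).
data = [((i,), float(i)) for i in range(10)]
model, avg_error, errors = cross_validate(data, k=5)
```

Here `errors` holds one score per fold and `avg_error` is the performance estimate, while `model` was trained on all 10 rows and is neither the last fold's model nor the best one.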