08-11-2017 02:34 AM
The default setting (10 folds) has been the consensus for a long time.
Depending on your data and the stability of your models, you could get away with less or need more.
Try different values and watch how much both the main performance number and its reported variance change. If they stay stable, you have enough data and stable enough models, so you can go with fewer iterations.
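As a rough illustration of that check, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for the example: the synthetic one-feature data, the toy threshold "model", and the helper names (`make_data`, `train_threshold`, `k_fold_scores`). It is not how any particular tool implements cross-validation; it only shows the idea of comparing the mean and the spread of the fold scores across several values of k.

```python
import random
import statistics

def make_data(n, seed=0):
    """Synthetic two-class data with one noisy feature (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(2.0 * label, 1.0)  # class 0 centred at 0, class 1 at 2
        data.append((x, label))
    return data

def train_threshold(train):
    """Toy model: a threshold halfway between the two class means."""
    m0 = statistics.mean(x for x, y in train if y == 0)
    m1 = statistics.mean(x for x, y in train if y == 1)
    return (m0 + m1) / 2.0

def accuracy(threshold, test):
    """Fraction of test examples on the correct side of the threshold."""
    return sum((x > threshold) == (y == 1) for x, y in test) / len(test)

def k_fold_scores(data, k, seed=0):
    """Return the accuracy measured on each of the k held-out folds."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(train_threshold(train), folds[i]))
    return scores

data = make_data(500)
summary = {}
for k in (3, 5, 10, 20):
    scores = k_fold_scores(data, k)
    summary[k] = (statistics.mean(scores), statistics.stdev(scores))
    print(f"k={k:2d}  mean accuracy={summary[k][0]:.3f}  std dev={summary[k][1]:.3f}")
```

If the mean barely moves between the rows and the spread stays in a range you can live with, the smaller k is probably enough for that dataset and model.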
08-11-2017 08:32 AM
I agree with @BalazsBarany that 10 folds is the default consensus, but with large datasets you can usually get away with 5. As noted, stability of the performance is the key measure. If you have a small dataset, you might consider the leave-one-out option, but for larger datasets it is not at all recommended.
08-14-2017 06:15 AM
I agree, and if you need a reference for that, see: Ron Kohavi, "A study of cross-validation and bootstrap for accuracy estimation and model selection," Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 1137-1143, August 20-25, 1995, Montreal, Quebec, Canada.
08-15-2017 06:55 AM
The choice of k is an example of the Bias-Variance trade-off present in every estimation.
Leave-one-out CV is the least biased option, but it can have a very high variance (the models are trained on datasets that differ by only one point, so they are highly correlated).
CVs with smaller values of k tend to be more biased (pessimistic: each model is trained on less data, so the error is overestimated) but have lower variance.
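The arithmetic behind both effects can be sketched in a few lines of Python. The function name `cv_geometry` and the dataset size of 1000 are made up for this illustration, and the sketch assumes k divides the number of examples evenly. It computes how much of the data each model trains on (less data means more pessimistic bias) and how much of one training set is shared with another (more sharing means more correlated models).

```python
def cv_geometry(n, k):
    """For n examples split into k equal folds, return a pair:
    (fraction of the data each model trains on,
     fraction of one training set that also appears in another)."""
    if n % k != 0:
        raise ValueError("this sketch assumes k divides n evenly")
    test_size = n // k
    train_size = n - test_size
    shared = n - 2 * test_size  # examples present in two different training sets
    return train_size / n, shared / train_size

n = 1000
for k in (2, 5, 10, 20, n):  # k == n is leave-one-out
    train_frac, overlap = cv_geometry(n, k)
    print(f"k={k:4d}  trains on {train_frac:.1%} of the data  "
          f"overlap between training sets={overlap:.1%}")
```

With k=2 the two training sets share nothing but each sees only half the data; with leave-one-out each model sees almost everything, and any two training sets are nearly identical.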
In practical terms, if the estimate of model performance is very important, you can run several CVs with k ranging from 5 to 20 and then choose the one with the highest variance you can still accept. If the estimate is not very important (e.g. it is used only for feature selection or parameter optimization), you can leave k at 10, or reduce it to 5 if you need it to run fast.