Auto Model and overfitting
I've been experimenting with Auto Model for Prediction and am generally happy with the concept and results.
In the Auto Model process the sampling is set to 80/20. Is this sufficient to control potential overfitting? I am getting performance ranging from about 60% accuracy for Naive Bayes to 87% accuracy for GBT. I have fewer than 1000 rows of data and 20 attributes in each data set. GBT is generating about 20 trees. (I would potentially be operationalising this across hundreds of datasets, with a dedicated model per dataset.)
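To put those numbers in context, here is a rough back-of-the-envelope sketch of how much an accuracy figure from a single holdout can move at this data size. It assumes "80/20" means 80% of rows for training and 20% held out for testing, takes 1000 rows as an upper bound (so at most 200 test rows), and uses the simple normal-approximation interval for a proportion. The function name `accuracy_interval` is made up for illustration, and none of this is what Auto Model computes internally.

```python
import math


def accuracy_interval(accuracy, n_test, z=1.96):
    """Normal-approximation interval for an accuracy measured on n_test rows.

    z=1.96 corresponds to roughly 95% coverage. This is a simplification and
    gets less reliable for very small n_test or accuracies near 0 or 1.
    """
    std_err = math.sqrt(accuracy * (1.0 - accuracy) / n_test)
    low = max(0.0, accuracy - z * std_err)
    high = min(1.0, accuracy + z * std_err)
    return low, high


# Assumption: 1000 rows is the upper bound from the post, 20% held out.
n_rows = 1000
n_test = n_rows * 20 // 100

# Accuracies quoted in the post.
for name, acc in [("Naive Bayes", 0.60), ("GBT", 0.87)]:
    low, high = accuracy_interval(acc, n_test)
    print(f"{name}: {acc:.0%} on {n_test} test rows, "
          f"plausible range about {low:.1%} to {high:.1%}")
```

Under these assumptions the 87% for GBT carries an uncertainty of a bit under 5 percentage points either way, and more with fewer rows. A single 80/20 split therefore gives only a coarse read on overfitting at this data size. Averaging over several different splits, or using cross-validation, would tighten the estimate.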