Decision Tree Pruning
1. RapidMiner supports pessimistic pruning (i.e., top-down) but not optimistic pruning (i.e., bottom-up). Is that correct?
2. What are the precise logical steps in the pruning process?
3. When the Decision Tree is trained on the training set with the pruning option enabled, on which validation set is the classification error computed? It cannot be the entire training set, because the fully grown tree would then have a classification error of essentially 0, which would always be the minimum. My understanding of pruning is that a cost-complexity measure is computed by adding a penalty for tree size to the classification error, and the tree that minimizes the classification error on the validation set is chosen. When only the training set is used, how is this validation done?
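To make the understanding in question 3 concrete, here is a minimal Python sketch of the general cost-complexity idea. It is not taken from RapidMiner and does not claim to reflect what PessimisticPruner.java does. The candidate subtrees, their error counts, and all names (`Candidate`, `cost_complexity`, `alpha`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical pruned version of one tree (all numbers are made up)."""
    name: str
    leaves: int        # tree size, measured as number of leaves
    train_errors: int  # misclassified training examples
    valid_errors: int  # misclassified held-out examples


def cost_complexity(c: Candidate, n_train: int, alpha: float) -> float:
    """Training error rate plus a penalty of alpha per leaf."""
    return c.train_errors / n_train + alpha * c.leaves


def best_by_cost_complexity(cands, n_train: int, alpha: float) -> Candidate:
    """Pick the candidate with the lowest penalized training error."""
    return min(cands, key=lambda c: cost_complexity(c, n_train, alpha))


def best_by_validation(cands) -> Candidate:
    """Pick the lowest held-out error; ties go to the smaller tree."""
    return min(cands, key=lambda c: (c.valid_errors, c.leaves))


# Illustrative numbers only: 100 training examples, 4 candidate subtrees.
N_TRAIN = 100
CANDIDATES = [
    Candidate("full", leaves=12, train_errors=0, valid_errors=9),
    Candidate("pruned_a", leaves=6, train_errors=4, valid_errors=6),
    Candidate("pruned_b", leaves=3, train_errors=10, valid_errors=7),
    Candidate("root_only", leaves=1, train_errors=30, valid_errors=28),
]

if __name__ == "__main__":
    for alpha in (0.0, 0.01, 0.05):
        chosen = best_by_cost_complexity(CANDIDATES, N_TRAIN, alpha)
        print(f"alpha={alpha}: {chosen.name}")
    print("validation choice:", best_by_validation(CANDIDATES).name)
```

With no size penalty (`alpha = 0`) the fully grown tree always wins on the training set, which is exactly the concern raised in question 3. A positive penalty or a separate validation set is what allows a smaller tree to be selected.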
I found related ideas in this previous post and another previous post. I have also looked at the RapidMiner source file PessimisticPruner.java, but I could not work out the logic from it.
@IngoRM, @land, and others - any help would be much appreciated.