Hi! I have a question about the stability of decision trees. I've been generating C4.5 trees (10 attributes, 4000 instances) with, for example, 81.35% ± 1.93 accuracy in 10-fold cross-validation, which are good models for my purposes. But when I delete a few training instances (about 10, say) and regenerate the model, I get a different tree. From what I've read, this is due to the ("well known") instability problem of decision trees. Despite it being a well-known problem, I could not find a formal approach for studying it (sampling? studying the variance across the 10 folds of cross-validation?), nor could I find out how to overcome it.
Could anyone please give me a hint on how to deal with this problem?
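For what it's worth, here is a minimal sketch of the kind of experiment I mean, assuming scikit-learn's CART implementation as a stand-in for C4.5 and a synthetic dataset (the dataset, tree settings, and the `agreement` helper are all made up for illustration): refit the tree after deleting a few instances, measure how often the two trees agree, and compare against a bagged ensemble, which is one standard way of damping this instability.

```python
# Sketch: quantify decision-tree instability under small training-set
# perturbations, using scikit-learn's CART trees (not C4.5) on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

rng = np.random.RandomState(0)
# Synthetic stand-in for the real dataset: 4000 instances, 10 attributes.
X, y = make_classification(n_samples=4000, n_features=10, random_state=0)

def agreement(model_a, model_b, X_eval):
    """Fraction of points on which two fitted models predict the same class."""
    return float(np.mean(model_a.predict(X_eval) == model_b.predict(X_eval)))

# Tree on the full data vs. tree refit after deleting 10 random instances.
full = DecisionTreeClassifier(random_state=0).fit(X, y)
keep = rng.permutation(len(X))[:-10]
perturbed = DecisionTreeClassifier(random_state=0).fit(X[keep], y[keep])
print("single-tree agreement:", agreement(full, perturbed, X))

# Bagging averages many trees over bootstrap resamples, which is a
# standard remedy for the high variance of single trees.
bag_full = BaggingClassifier(DecisionTreeClassifier(),
                             n_estimators=50, random_state=0).fit(X, y)
bag_pert = BaggingClassifier(DecisionTreeClassifier(),
                             n_estimators=50, random_state=0).fit(X[keep], y[keep])
print("bagged agreement:", agreement(bag_full, bag_pert, X))
```

Repeating the deletion many times and looking at the distribution of the agreement score (or of the predictions themselves) would give a sampling-based variance estimate of the sort I was asking about.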