Decision Tree depth and Parameter Optimisation

skewed_analysis Member Posts: 1 Newbie

I'm running parameter optimisation on a Decision Tree; however, the optimal parameters always report a higher depth value than the actual depth of the resulting decision tree.

My flow is as follows: 
1. Optimize Parameters (Grid) operator
2. Cross Validation inside the Optimize Parameters (Grid)
3. Decision Tree inside the Cross Validation (testing with Apply Model & Performance Binomial Classification)

I have pointed the selected parameters at the correct Decision Tree operator (#3 above). The only parameter being optimised is the maximal depth, with values: min 1, max 20, steps 30.

For example, the optimisation output gives me a max depth of 9, whereas when I inspect the Decision Tree model output I can see that it only has a depth of three.

Am I making a mistake somewhere or can the optimal maximum depth of a decision tree be higher than the actual node depth that is used?



  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 363 RM Data Scientist
    Hi @skewed_analysis,

    Thanks for reporting your findings. If you disable both pruning and pre-pruning, the optimised tree depth should get closer to the actual depth. The maximal depth parameter limits the size of the tree to prevent overfitting; it is only an upper bound, so the final tree model cannot have a depth greater than max_tree_depth, but it can certainly end up shallower.
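    To illustrate the point outside of RapidMiner, here is a minimal scikit-learn sketch (an analogy to the flow above, not RapidMiner's own operators): grid-searching `max_depth` from 1 to 20 with cross-validation, then comparing the chosen cap to the depth the fitted tree actually reaches. The dataset and pre-pruning setting (`min_samples_leaf`) are illustrative choices.

    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)

    # Grid-search the depth cap 1..20 (one step per integer), with 5-fold CV,
    # mirroring Optimize Parameters (Grid) wrapped around Cross Validation.
    grid = GridSearchCV(
        DecisionTreeClassifier(min_samples_leaf=20, random_state=0),  # pre-pruning
        param_grid={"max_depth": range(1, 21)},
        cv=5,
    )
    grid.fit(X, y)

    best_cap = grid.best_params_["max_depth"]        # "optimal" max depth
    actual_depth = grid.best_estimator_.get_depth()  # depth the tree really reaches

    # max_depth is an upper bound: pre-pruning can stop growth earlier,
    # so actual_depth <= best_cap, and ties in CV score can favour a larger cap.
    print(best_cap, actual_depth)
    assert actual_depth <= best_cap
    ```

    Because pre-pruning stops growth before the cap is reached, many different `max_depth` values can yield identical cross-validation scores, and the grid search may report any one of them as "optimal", including values well above the tree's real depth.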

    PS: you do not need 30 steps to iterate an integer from 1 to 20 ;)
