Difference between normal decision tree with information gain criterion and W-J48

koknayaya Member Posts: 20 Contributor I
edited January 15 in Help
Hi. Have a good day everyone!

I want to ask a couple of questions.

1st question
What is the difference between 
a) normal decision tree with information gain criterion and
b) W-J48?

I'm quite confused about the difference.

Why don't we just use the basic decision tree and choose 'Information gain' as the criterion instead of using W-J48?

2nd question
Are there any guidelines for choosing suitable values for the W-J48 parameters, such as the confidence threshold for pruning and the minimum number of instances per leaf?

I don't know what values should be set for these parameters.

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,277 Unicorn
    I think this is a similar discussion to the following thread: https://community.rapidminer.com/discussion/54804/difference-between-c4-5-and-w-j48#latest
    For more details on the W-J48 implementation you should consult the Weka project documentation.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • varunm1 Moderator, Member Posts: 964 Unicorn
    edited January 15
    Hi @koknayaya

    Conceptually I don't see much difference between the two, as they both rest on the same idea. The information gain ratio criterion and J48 (Weka's implementation of C4.5) both come from Quinlan's work. Both are also controlled by a pruning confidence, denoted 'C', and a minimal leaf size, denoted 'M'. You can find both options in both decision tree operators.
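
To make the criterion part of the question concrete, here is a minimal sketch in plain Python (not RapidMiner or Weka code) comparing plain information gain with the gain ratio on a toy dataset. The data and the names `entropy`, `split_scores`, `row_id` and `windy` are made up for illustration. It assumes the usual textbook definitions, where gain ratio is the information gain divided by the entropy of the split itself.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def split_scores(values, labels):
    """Return (information gain, gain ratio) for splitting labels by values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute values themselves.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Hypothetical toy data: six examples with a binary class label.
labels = ["yes", "yes", "no", "no", "yes", "no"]
row_id = [1, 2, 3, 4, 5, 6]              # unique per row, useless for prediction
windy = ["a", "a", "b", "b", "a", "a"]   # a genuinely informative attribute

gain_id, ratio_id = split_scores(row_id, labels)
gain_windy, ratio_windy = split_scores(windy, labels)

print(f"row_id: gain={gain_id:.3f}  gain ratio={ratio_id:.3f}")
print(f"windy:  gain={gain_windy:.3f}  gain ratio={ratio_windy:.3f}")
```

On this toy data plain information gain ranks the ID-like attribute highest, because a split into many tiny groups always looks pure, while the gain ratio penalises it and ranks the two-valued attribute higher. That penalty is the main practical difference between choosing 'Information gain' and using a gain-ratio-based learner.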

    For your second question, the default values are 0.25 for the confidence 'C' and 2 for 'M'. The lower the confidence, the more heavily the tree is pruned. You need to try different combinations to see what works for your data.
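
As a rough illustration of why a lower confidence leads to more pruning, the sketch below computes the pessimistic (upper-bound) error estimate commonly given in textbook descriptions of C4.5-style pruning. It is an approximation written in plain Python; the function name `pessimistic_error` is made up here, and Weka's actual J48 code may differ in its details.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error(errors, n, confidence=0.25):
    """Upper confidence bound on a leaf's error rate.

    errors: misclassified training instances at the leaf
    n: total training instances at the leaf
    confidence: the pruning confidence 'C' (assumed to lie in (0, 1))
    """
    f = errors / n                               # observed error rate
    z = NormalDist().inv_cdf(1.0 - confidence)   # normal quantile for C
    z2 = z * z
    numerator = f + z2 / (2 * n) + z * sqrt(f / n - f * f / n + z2 / (4 * n * n))
    return numerator / (1 + z2 / n)


# A leaf with 2 errors out of 6 instances, evaluated at several confidences.
for c in (0.05, 0.10, 0.25, 0.50):
    print(f"C={c:.2f}  estimated error={pessimistic_error(2, 6, c):.3f}")
```

With 2 errors in 6 instances and C = 0.25 the estimate is about 0.47, well above the observed rate of 0.33. Smaller values of C push the estimate higher still, which makes small subtrees look worse and therefore more likely to be collapsed into a leaf.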

    Thanks,
    Varun