Difference between normal decision tree with information gain criterion and W-J48

koknayaya Member Posts: 20 Contributor I
edited January 2019 in Help
Hi. Have a good day everyone!

I want to ask a question.

1st question
What is the difference between 
a) normal decision tree with information gain criterion and
b) W-J48?

I'm quite confused about the difference.

Why don't we just use the basic decision tree and choose 'Information gain' as the criterion instead of using W-J48?

2nd question
Are there any guidelines for setting suitable values for the W-J48 parameters, such as the confidence threshold for pruning and the minimum number of instances per leaf?

I don't know what values should be set for these parameters.

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    I think this is a similar discussion to the following thread: https://community.rapidminer.com/discussion/54804/difference-between-c4-5-and-w-j48#latest
    For more details on the W-J48 implementation you should consult the Weka project documentation.
    Brian T.
    Lindon Ventures 
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    edited January 2019
    Hi @koknayaya

    I don't see much conceptual difference between the two, as both are built on the same idea. The gain ratio criterion and C4.5, which W-J48 provides through Weka's J48 implementation, were both developed by Quinlan. Both trees are governed mainly by the pruning confidence, denoted 'C', and the minimal leaf size, denoted 'M'. You can find both options in both decision tree operators.
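One place where the two criteria mentioned in this thread can differ is in how they treat attributes with many distinct values. The following is a minimal sketch in plain Python, not taken from RapidMiner or Weka: the function names and the toy data are made up for illustration, and it leaves out the extra checks C4.5 applies when choosing a split. It compares plain information gain with Quinlan's gain ratio, which divides the gain by the entropy of the split itself.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def split_scores(values, labels):
    """Return (information_gain, gain_ratio) for a nominal attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute values themselves.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Toy data (hypothetical): 8 rows, 4 "yes" and 4 "no".
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
outlook = ["sun", "sun", "sun", "rain", "rain", "rain", "rain", "rain"]
row_id = list(range(8))  # unique per row, useless for prediction

gain_outlook, ratio_outlook = split_scores(outlook, labels)
gain_id, ratio_id = split_scores(row_id, labels)

print(f"outlook: gain={gain_outlook:.3f} ratio={ratio_outlook:.3f}")
print(f"row_id:  gain={gain_id:.3f} ratio={ratio_id:.3f}")
```

On this toy data, information gain ranks the meaningless `row_id` attribute highest because every row ends up in its own pure branch, while gain ratio ranks `outlook` higher. On attributes with only a few values the two criteria often pick the same split, which fits the point that the trees are conceptually similar.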

    For your second question, the default values are 0.25 for the confidence 'C' and 2 for 'M'. The lower the confidence, the more the tree is pruned. You need to try different combinations and compare the results.
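To see why a lower confidence leads to more pruning, here is a rough sketch of the pessimistic error estimate used in C4.5-style pruning. It assumes the textbook normal-approximation formula for the upper confidence bound on a leaf's error rate. Weka's actual code adds special cases and a continuity correction, so treat the function name and the numbers as illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error_rate(errors, n, confidence=0.25):
    """Upper confidence bound on the true error rate of a leaf.

    errors: misclassified training instances at the leaf
    n: total training instances at the leaf
    confidence: pruning confidence 'C' (smaller means more pessimistic)
    """
    # z-value such that the upper tail of the standard normal has mass `confidence`.
    z = NormalDist().inv_cdf(1.0 - confidence)
    f = errors / n  # observed error rate on the training data
    numerator = (
        f
        + z * z / (2 * n)
        + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    )
    return numerator / (1 + z * z / n)


# A leaf with 2 errors out of 10 instances (observed error rate 0.2).
for c in (0.25, 0.1, 0.01):
    print(f"C={c}: estimated error rate {pessimistic_error_rate(2, 10, c):.3f}")
```

As 'C' shrinks, the estimated error of each leaf grows, so replacing a subtree with a single leaf looks comparatively better more often and the tree is cut back further. In practice a small grid over 'C' and 'M', scored with cross-validation, is a reasonable way to try the combinations.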

    Thanks,
    Varun
    https://www.varunmandalapu.com/
