
question about one of the tutorials

wclaster (Member, University Professor) Posts: 43
In the first tutorial, "Getting Started", section 4/5 has you build the simplest of decision trees on the Titanic Training data set. After running the process, the tutorial says:
       "Great job! Your process should now be complete and deliver a decision tree model, which explains to you what most of the survivors and most of the victims had in common."

I don't see exactly what they are referring to. Just from looking at that tree (see attachment), what do they mean by what the survivors (or victims) have in common?
I don't know how they are reading the tree to say that. Apparently it is obvious, since they did not explain it, but I don't follow. Thank you.

Best Answers

  • MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor) Posts: 3,507 RM Data Scientist
    Solution Accepted
    I think we changed the default value of a parameter. If you set min_gain to 0.1 instead of 0.01, you should get a smaller tree that is easier to understand.
    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor) Posts: 3,507 RM Data Scientist
    Solution Accepted
    Hi,
    I haven't seen the resulting tree, but your interpretation sounds reasonable. Keep in mind that each level of the tree adds another AND condition.
    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
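
To illustrate why raising min_gain from 0.01 to 0.1 gives a smaller tree, here is a minimal Python sketch of a generic minimum-gain check. It is not RapidMiner's implementation: the entropy-based gain, the `>=` comparison, the helper names, and the toy labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting parent into left and right."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


def keep_split(gain, min_gain):
    """Keep a candidate split only if its gain reaches the threshold (assumed rule)."""
    return gain >= min_gain


# Hypothetical toy labels, not taken from the Titanic data set.
left = ["yes"] * 6 + ["no"] * 4
right = ["yes"] * 4 + ["no"] * 6
parent = left + right

gain = information_gain(parent, left, right)  # a weak split, gain is about 0.029
print(keep_split(gain, 0.01))  # True: the split is kept, the tree grows
print(keep_split(gain, 0.1))   # False: the split is rejected, the tree stays smaller
```

With the higher threshold, weak splits like this one are discarded, so fewer branches survive and the remaining tree is easier to read.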

Answers

  • wclaster (Member, University Professor) Posts: 43
    Thank you! That definitely helps.
    In this case, would you say part of the answer is something like:

    (Low number of parents, children, siblings, spouses) AND high passenger fare implies survive?
    There is actually another "yes" leaf with more data, which I guess could be described as:
    Passenger fare high AND (parents, children) < 3.5 AND (siblings, spouses) < 2.5 implies survive.

    Thanks again!
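
The reading discussed in this thread, where each level of the tree adds another AND, can be sketched in Python by walking every root-to-leaf path and joining its conditions. The tree below is hypothetical: the thresholds 3.5 and 2.5 come from the post above, while the fare threshold of 50.0, the attribute names, the `<` / `>=` convention, and the other leaf labels are made up for illustration.

```python
def rules(node, conditions=()):
    """Yield (conditions, label) for every root-to-leaf path of the tree."""
    if isinstance(node, str):  # a leaf holds only the predicted label
        yield list(conditions), node
        return
    attribute, threshold, below, at_or_above = node
    # Going one level deeper appends one more condition to the conjunction.
    yield from rules(below, conditions + (f"{attribute} < {threshold}",))
    yield from rules(at_or_above, conditions + (f"{attribute} >= {threshold}",))


# Hypothetical tree: (attribute, threshold, subtree if below, subtree otherwise).
tree = (
    "fare", 50.0,  # 50.0 is an invented stand-in for "high passenger fare"
    "no",
    ("parents_children", 3.5,
        ("siblings_spouses", 2.5, "yes", "no"),
        "no"),
)

for conditions, label in rules(tree):
    print(" AND ".join(conditions), "=>", label)
```

One of the printed lines is `fare >= 50.0 AND parents_children < 3.5 AND siblings_spouses < 2.5 => yes`, which has the same shape as the rule proposed above: what the passengers in a leaf "have in common" is the full conjunction of conditions on the path to that leaf.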
