
question about one of the tutorials

wclaster (Member, University Professor; Posts: 43)
In section 4/5 of the first tutorial, "Getting Started", after building the simplest of decision trees on the Titanic Training data set and running the process, the tutorial says:
       "Great job! Your process should now be complete and deliver a decision tree model, which explains to you what most of the survivors and most of the victims had in common."     

I don't see exactly what they are referring to. Just from looking at that tree (see attachment), what do they mean by what the survivors (or victims) have in common?
I don't know how they are reading the tree to reach that conclusion. Apparently it is obvious, since they did not explain it, but I don't follow. Thank you.

Best Answers

  • MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; Posts: 3,529; RM Data Scientist)
    Solution Accepted
    I think a default parameter changed. If you set min_gain to 0.1 instead of 0.01, you should get a smaller tree that is easier to understand.
    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; Posts: 3,529; RM Data Scientist)
    Solution Accepted
    Hi,
    I don't know the resulting tree, but your interpretation sounds reasonable. Keep in mind that each level of the tree adds another AND condition.
    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
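To illustrate why raising min_gain gives a smaller tree, here is a minimal sketch in Python. It assumes plain information gain as the split criterion and a simple "accept the split only if its gain reaches the threshold" rule; RapidMiner's actual criterion and internal computation may differ. The function names and the toy labels are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

def accept_split(parent, left, right, min_gain):
    """Hypothetical pre-pruning rule: keep a split only if its gain reaches min_gain."""
    return information_gain(parent, left, right) >= min_gain

# Toy data (invented): a weak split that barely separates the two classes.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 3 + ["no"] * 2
right = ["yes"] * 2 + ["no"] * 3

print(round(information_gain(parent, left, right), 4))
print(accept_split(parent, left, right, min_gain=0.01))  # low threshold keeps the weak split
print(accept_split(parent, left, right, min_gain=0.1))   # higher threshold rejects it
```

With the higher threshold, weak splits like this one are never made, so the tree has fewer levels and fewer leaves to read.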

Answers

  • wclaster (Member, University Professor; Posts: 43)
    Thank you! That definitely helps. 
    In this case, would you say part of the answer is something like:

    (low number of parents, children, siblings, spouses) AND (high passenger fare) implies survive?
    There is actually another "yes" leaf with more data, which I guess could be described as:
    (passenger fare high) AND (parents, children) < 3.5 AND (siblings, spouses) < 2.5 implies survive.

    Thanks again!
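The reading discussed above, where each level of the tree adds another AND, can be sketched in Python by walking every root-to-leaf path and joining the tests along it. This is only an illustration: the `Node` class, the attribute names, the fare threshold of 50.0, and the labels on the "no" leaves are all hypothetical and are not taken from the tutorial's tree. Only the 3.5 and 2.5 thresholds come from the post above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Node:
    attribute: Optional[str] = None    # attribute tested at an inner node
    threshold: Optional[float] = None  # split point for that attribute
    low: Optional["Node"] = None       # subtree for attribute <= threshold
    high: Optional["Node"] = None      # subtree for attribute > threshold
    label: Optional[str] = None        # class label, set on leaves only

def rules(node: Node, conditions: Tuple[str, ...] = ()) -> List[Tuple[str, str]]:
    """Return one (condition, label) pair per leaf; conditions on a path are ANDed."""
    if node.label is not None:
        return [(" AND ".join(conditions) or "TRUE", node.label)]
    low_test = f"{node.attribute} <= {node.threshold}"
    high_test = f"{node.attribute} > {node.threshold}"
    return rules(node.low, conditions + (low_test,)) + rules(node.high, conditions + (high_test,))

# Hypothetical tree shaped like the rule in the post; the fare threshold is invented.
tree = Node(
    "fare", 50.0,
    low=Node(label="no"),
    high=Node(
        "parents_children", 3.5,
        low=Node("siblings_spouses", 2.5, low=Node(label="yes"), high=Node(label="no")),
        high=Node(label="no"),
    ),
)

for condition, label in rules(tree):
    print(f"IF {condition} THEN survived = {label}")
```

Each printed line corresponds to one leaf, so "what the survivors had in common" is the set of lines ending in "yes".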
