"path dependent Decision tree??"

adfeds Member Posts: 2 Contributor I
edited June 2019 in Help
So, I have test data for a number of students. The data is stored across columns like ID, Test1-score, test2-score, test3-score, and finally the label attribute "Pass Final", which is binomial (1 or 0). What I am looking to do is force the tree to evaluate the likelihood of passing the final based on how the student did on test 1, then look at test 2 in relation to test 1 and the final, and so on. I've tried ordering attributes, selecting attributes, and a number of other things, but I can't find out how to create a path that the tree must follow. This is important because we need to identify at which stage of the students' matriculation through our program we should take them aside for remediation.

Any help will be greatly appreciated. I've also tried turning the data into a time-series, list-type (flat-file) dataset with ID, Test, Date & Score, but RapidMiner Starter Edition is choking on the size of the data and will not load it. Thanks in advance.

Allen
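
One way to get the "path the tree must follow" described above is to fix the split attribute per depth: the root may only split on test 1, the next level only on test 2, and so on. The sketch below does this in plain Python rather than in RapidMiner. The scores are made up, the dict-based node layout and the helper names (`best_threshold`, `build`, `estimate`) are hypothetical, and Gini impurity is an assumed split criterion.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(rows, labels, col):
    """Threshold on `col` that lowers weighted Gini the most, or None if none helps."""
    values = sorted(set(r[col] for r in rows))
    best, best_score = None, gini(labels)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[col] <= t]
        right = [y for r, y in zip(rows, labels) if r[col] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best, best_score = t, score
    return best

def build(rows, labels, order, depth=0):
    """Tree that may only split on order[0] at the root, order[1] below it, etc."""
    node = {"n": len(labels), "pass_rate": sum(labels) / len(labels)}
    if depth == len(order):
        return node
    col = order[depth]
    t = best_threshold(rows, labels, col)
    if t is None:  # no useful split at this stage: stop here (a simplification)
        return node
    low = [(r, y) for r, y in zip(rows, labels) if r[col] <= t]
    high = [(r, y) for r, y in zip(rows, labels) if r[col] > t]
    node["col"], node["threshold"] = col, t
    node["low"] = build([r for r, _ in low], [y for _, y in low], order, depth + 1)
    node["high"] = build([r for r, _ in high], [y for _, y in high], order, depth + 1)
    return node

def estimate(node, row, stages):
    """Observed pass rate after following at most `stages` splits for one student."""
    while stages > 0 and "col" in node:
        node = node["high"] if row[node["col"]] > node["threshold"] else node["low"]
        stages -= 1
    return node["pass_rate"]

# Made-up scores and outcomes, for illustration only.
rows = [
    {"test1": 40, "test2": 45, "test3": 50},
    {"test1": 45, "test2": 70, "test3": 75},
    {"test1": 50, "test2": 40, "test3": 42},
    {"test1": 80, "test2": 85, "test3": 90},
    {"test1": 85, "test2": 60, "test3": 88},
    {"test1": 90, "test2": 92, "test3": 95},
]
labels = [0, 1, 0, 1, 1, 1]  # "Pass Final"

tree = build(rows, labels, order=["test1", "test2", "test3"])
student = {"test1": 48, "test2": 44, "test3": 0}
for k in range(3):
    print(k, estimate(tree, student, k))
```

Calling `estimate` with `stages` set to 1, 2, and so on gives the pass rate after each test in turn, which is one way to see at which stage a student's outlook drops far enough to justify remediation.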

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,249 RM Data Scientist
    Okay, I don't get this.

    You have trained a decision tree. You apply it, and see if you got ones or zeros. Now you want to know why there is a 1 or a 0?
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • adfeds Member Posts: 2 Contributor I
    No, the decision trees are looking at, say, test 4 before test 1. Maybe this is simply not done, but I need it to evaluate the tests in order in terms of the likelihood of passing. I'm admittedly a RapidMiner noob, so perhaps this type of thing is not done, but I would think there has to be a way to evaluate the likelihood of an outcome based on a series of gates or tipping points.
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,249 RM Data Scientist
    Maybe it is just me, but your terminology is really different from standard RapidMiner terminology. Sounds like log-likelihood methods to me.

    Anyway - can't you take the confidences of your tree? They are normalized between 0 and 1 and might be interpreted as a likelihood.
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
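
The confidence idea above can be sketched as the share of each class among the training examples in the leaf a student falls into. That reading of a tree confidence is an assumption here, not a statement about RapidMiner's internals, and `leaf_confidences` is a hypothetical helper with made-up labels.

```python
from collections import Counter

def leaf_confidences(leaf_labels):
    """Share of each class among the training labels that ended up in one leaf."""
    if not leaf_labels:
        raise ValueError("a leaf must contain at least one training example")
    counts = Counter(leaf_labels)
    total = len(leaf_labels)
    return {cls: n / total for cls, n in counts.items()}

# Made-up leaf: three students who passed the final (1) and one who did not (0).
conf = leaf_confidences([1, 1, 1, 0])
print(conf[1], conf[0])
```

The values for a leaf sum to 1, so the entry for class 1 can be read as a rough pass likelihood for students who reach that leaf.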