"In RapidMiner, how do you select the attributes to be within a decision tree?"

sim Member Posts: 18 Learner I
edited May 23 in Help
Hi, 

I'm trying to create a decision tree within rapidminer. How would I select the attributes to be within this decision tree?

So far only two of my 11 attributes are being used. Ideally I would like around 4-5 of them to be included in the decision tree. I have selected a label; however, the attribute assigned as the label seems to influence how many attributes are incorporated. Is this normal?
Answers

  • varunm1 Member Posts: 666   Unicorn
    Hi @sim

    Because you are trying to classify a label, the decision tree drops attributes (through pruning) that are not relevant for the prediction, or that are simply less relevant than other attributes. This is normal. Since this is a supervised learning algorithm, the label does affect the tree structure, because the tree is trained on both the attributes and the label.

    If you think the tree is highly dependent on one attribute, you can check whether that attribute is highly correlated with the label.

    Regards,
    Varun
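
The correlation check suggested above can be sketched outside RapidMiner in a few lines of plain Python. This is only an illustration under stated assumptions: the attribute names (`hours_prepared`, `shoe_size`) and all values are made up, the label is coded as 1 for pass and 0 for fail, and `pearson` is a helper written for this example rather than a RapidMiner function.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot co-vary with anything
    return cov / sqrt(var_x * var_y)


# Hypothetical label: 1 = passed, 0 = failed
passed = [1, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical numeric attributes for the same eight students
attributes = {
    "hours_prepared": [9, 7, 2, 8, 3, 1, 6, 2],
    "shoe_size": [40, 42, 41, 39, 40, 42, 41, 39],
}

correlations = {name: pearson(values, passed) for name, values in attributes.items()}

# List attributes from most to least strongly correlated with the label
for name, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: r = {r:+.2f}")
```

An attribute whose correlation with the label is close to +1 or -1 is likely to dominate the tree, while one close to 0 is unlikely to be chosen for a split on its own.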
  • sim Member Posts: 18 Learner I
    Hi Varun, 

    Thank you for your response! I'm trying to predict the likelihood of a student passing an exam from influencing attributes, such as whether they have prepared for the exam. I'm a little unsure how I should go about obtaining this within a decision tree. So far I just have a single node that predicts the likelihood of a student passing their exams, with "pass" and "fail" being the leaves. I have allocated the attribute pass mark as the leaf, hoping that this will be the attribute that is predicted; however, this does not appear to be the case.
    Is there anything that I can do to fix this? 
  • varunm1 Member Posts: 666   Unicorn
    Hi @sim

    If possible, can you provide the XML code of your process and some sample data so that I can take a look?

    Regards,
    Varun
  • sim Member Posts: 18 Learner I
    Hi Varun, 

    Once again, thank you for such a quick response! I'll try that. In case it does not work, is there anything else that I can do?
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,213   Unicorn
    In your case, based on your description, you want to use the Set Role operator to set the attribute that records whether the student passes the exam or not as the label.  This is the quantity the decision tree will try to predict.  It will use the other attributes to try to make the prediction.  The DT algorithm will select as many as are helpful for making that prediction.  To begin, you can run it with both pruning and prepruning off.  This isn't recommended for a final model, but it will show you the most optimistic tree possible.  If it doesn't select any attributes at all even with no pruning or prepruning, that means that it can't find any attributes that help separate the two classes (pass vs fail) that you are trying to predict.  Take a look at the DT operator tutorial if you aren't sure about how to wire the operators correctly.  Without seeing your process or your data it will be hard to be any more specific than this.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
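
To make "attributes that help separate the two classes" concrete, here is a minimal sketch of information gain, one of the split criteria decision tree learners commonly use. It is plain Python rather than RapidMiner's own implementation, and the attribute names (`prepared`, `attended_revision`, `gender`) and all values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on a nominal attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label within each attribute value
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical label column: the attribute given the label role
label = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]

# Hypothetical nominal attributes for the same eight students
attributes = {
    "prepared": ["yes", "yes", "no", "yes", "no", "no", "yes", "yes"],
    "attended_revision": ["yes", "no", "no", "yes", "no", "yes", "yes", "no"],
    "gender": ["f", "m", "f", "m", "f", "m", "f", "m"],
}

gains = {name: information_gain(vals, label) for name, vals in attributes.items()}

# Attributes with the highest gain are the ones a tree would split on first
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: gain = {gain:.3f}")
```

An attribute with a gain of zero, like `gender` in this made-up data, does nothing to separate pass from fail, so a tree has no reason to split on it even with pruning and prepruning switched off. Attributes with small but positive gain are the ones that prepruning and pruning tend to remove.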