
Decision tree

sshilderman Member Posts: 9 Contributor II
edited November 2018 in Help

I'm trying to use a decision tree to predict whether a user will leave.

My data include 4 regular attributes (2 nominal, 2 integer), and 1 special attribute (nominal label).

When I use the Decision Tree operator, I don't get a tree that uses all of the attributes: only one of the regular attributes appears (as the root), and the leaves contain the label values (which is OK).

 

What am I doing wrong?


Answers

  • bhupendra_patil Employee, Member Posts: 168 RM Data Scientist

    Hello, this may simply be happening because the data does not contain patterns that satisfy the criteria you set.

     

    I suggest trying different values for the pruning and prepruning settings and for the confidence parameter.
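To illustrate why a tree can stop after a single split, here is a minimal Python sketch of prepruning by information gain. It is not RapidMiner's implementation: the toy rows, the attribute names (`plan`, `region`, `leaves`) and the `MINIMAL_GAIN` threshold are all made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label):
    """Entropy reduction obtained by splitting rows on one nominal attribute."""
    base = entropy([r[label] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[label])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy data: "plan" separates the label perfectly, "region" does not.
rows = [
    {"plan": "free", "region": "north", "leaves": "yes"},
    {"plan": "free", "region": "south", "leaves": "yes"},
    {"plan": "paid", "region": "north", "leaves": "no"},
    {"plan": "paid", "region": "south", "leaves": "no"},
]

MINIMAL_GAIN = 0.1  # illustrative threshold only, not a recommended setting

gains = {a: information_gain(rows, a, "leaves") for a in ("plan", "region")}
# Attributes whose gain falls below the threshold are never used for a split.
usable = [a for a, g in gains.items() if g >= MINIMAL_GAIN]
print(gains)
print(usable)
```

In this toy case only `plan` clears the threshold, and after splitting on it both leaves are pure, so a one-level tree is already complete. If the leaves in your own tree are not pure, loosening the prepruning settings may let the tree grow further.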

     

    A better way to find good values for these is to use the "Optimize Parameters (Grid)" operator and give it a range for each parameter, so that it tries combinations of the settings that affect your model.

     

    The help for "Optimize Parameters (Grid)" includes a sample process that shows how this operator works.
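Conceptually, a grid optimization tries every combination of the listed parameter values and keeps the best-scoring one. The Python sketch below shows only that idea and is not how the RapidMiner operator is implemented. The parameter names, the value lists and `toy_score` are hypothetical stand-ins; in practice the score would come from training and validating a tree with the given settings.

```python
from itertools import product


def grid_search(param_grid, score):
    """Try every combination in param_grid and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score


def toy_score(params):
    """Hypothetical stand-in for a validated accuracy; replace with a real evaluation."""
    return 1.0 - abs(params["confidence"] - 0.25) - abs(params["minimal_gain"] - 0.05)


# Illustrative value ranges only.
grid = {
    "confidence": [0.1, 0.25, 0.5],
    "minimal_gain": [0.01, 0.05, 0.1],
}

best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

The number of runs is the product of the list lengths (here 3 x 3 = 9), so it is worth keeping each range short at first.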

     

    Good Luck

  • sshilderman Member Posts: 9 Contributor II

    Follow-up question:

     

    First of all, thank you for your answer.

    I first created a table with patterns manually, to check that I'm doing it right.

     

    Is there a way to find out which users end up in each leaf?

    I would like to learn which users will have a specific value (the label value) in the future.

     

    Best,

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist

    Hi,

     

    What you can do is use the Tree to Rules operator. As a result (see the attached process), you get the paths as strings, which might be helpful as a first step. There is no single-operator solution for applying these rules to a dataset to get "leaf IDs", but it might be possible to build a working process with something like Write as Text and then parse the resulting text files.

     

    Best,

    Martin

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="7.1.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="7.1.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.1.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="112" y="85">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="tree_to_rules" compatibility="7.1.001" expanded="true" height="82" name="Tree to Rules" width="90" x="246" y="85">
            <process expanded="true">
              <operator activated="true" class="parallel_decision_tree" compatibility="7.1.001" expanded="true" height="82" name="Decision Tree" width="90" x="45" y="34"/>
              <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="380" y="85">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Tree to Rules" to_port="training set"/>
          <connect from_op="Tree to Rules" from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_op="Tree to Rules" from_port="example set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="model" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
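If you go the route of parsing the exported rule text, a rough Python sketch of the idea might look like the following. It assumes one rule per line in the form `if <attribute> <op> <value> and ... then <label>`; the actual text written by RapidMiner may differ, so the regular expressions would need adjusting. The rule strings and rows below are hand-written examples, not real output from the process above, and `parse_rule`, `holds` and `assign_leaf` are hypothetical helper names.

```python
import re

# One condition such as "Outlook = rain" or "Humidity <= 77.5".
CONDITION = re.compile(r"^\s*(.+?)\s*(<=|>=|=|<|>)\s*(.+?)\s*$")


def parse_rule(line):
    """Split 'if A = x and B > 3 then yes' into (conditions, label), or None."""
    match = re.match(r"^\s*if\s+(.*?)\s+then\s+(\S+)", line)
    if not match:
        return None
    body, label = match.groups()
    conditions = []
    for part in body.split(" and "):
        cond = CONDITION.match(part)
        if cond is None:
            return None  # skip lines that do not fit the assumed format
        conditions.append(cond.groups())
    return conditions, label


def holds(row, attr, op, value):
    """Check a single condition against one row (a dict of attribute values)."""
    actual = row[attr]
    if op == "=":
        return str(actual) == value
    actual, value = float(actual), float(value)
    return {"<": actual < value, "<=": actual <= value,
            ">": actual > value, ">=": actual >= value}[op]


def assign_leaf(row, rules):
    """Return the index of the first rule whose conditions all hold, else None."""
    for leaf_id, (conditions, _label) in enumerate(rules):
        if all(holds(row, *c) for c in conditions):
            return leaf_id
    return None


# Hand-written example rules in the assumed format.
rule_text = """\
if Outlook = overcast then yes
if Outlook = rain and Wind = false then yes
if Outlook = rain and Wind = true then no
if Outlook = sunny and Humidity > 77.5 then no
if Outlook = sunny and Humidity <= 77.5 then yes
"""
rules = [r for r in map(parse_rule, rule_text.splitlines()) if r]

rows = [
    {"Outlook": "sunny", "Humidity": 85, "Wind": "false"},
    {"Outlook": "rain", "Humidity": 70, "Wind": "true"},
    {"Outlook": "overcast", "Humidity": 65, "Wind": "true"},
]
leaf_ids = [assign_leaf(row, rules) for row in rows]
print(leaf_ids)
```

Each row gets the position of the first matching rule as its "leaf ID", so grouping the rows by that value shows who sits in which leaf.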
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany