Decision Tree

Rawan Member Posts: 2 Newbie
Hi everyone,
After building a decision tree, how can I find out:
  1. The tree depth
  2. How many leaves (decision nodes) it has
Thanks

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi @Rawan,

    Check out Tree to Rules (included in RapidMiner) or "Decision Tree to ExampleSet" in the Converters extension.
    You can easily get the number of leaves from the number of rules, since each rule describes one path from the root to a leaf.
    The tree depth is a bit harder. With Decision Tree to ExampleSet, the conditions of a rule are joined by & signs, so the number of & signs plus one is the number of decisions applied along that path. The tree depth is the maximum of that value over all rules.

    Here's an example using the Converters extension:
    <?xml version="1.0" encoding="UTF-8"?><process version="9.5.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.5.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="-1"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.5.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="9.5.001" expanded="true" height="103" name="Decision Tree" width="90" x="179" y="34">
            <parameter key="criterion" value="gain_ratio"/>
            <parameter key="maximal_depth" value="10"/>
            <parameter key="apply_pruning" value="true"/>
            <parameter key="confidence" value="0.1"/>
            <parameter key="apply_prepruning" value="true"/>
            <parameter key="minimal_gain" value="0.01"/>
            <parameter key="minimal_leaf_size" value="2"/>
            <parameter key="minimal_size_for_split" value="4"/>
            <parameter key="number_of_prepruning_alternatives" value="3"/>
          </operator>
          <operator activated="true" class="converters:dectree_2_example_set" compatibility="0.6.000" expanded="true" height="82" name="Decision Tree to ExampleSet" width="90" x="313" y="34"/>
          <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro" width="90" x="447" y="34">
            <parameter key="macro" value="nodes"/>
            <parameter key="macro_type" value="number_of_examples"/>
            <parameter key="statistics" value="average"/>
            <parameter key="attribute_name" value=""/>
            <list key="additional_macros"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.5.001" expanded="true" height="82" name="Generate Attributes" width="90" x="581" y="34">
            <list key="function_descriptions">
              <parameter key="nodes" value="eval(%{nodes})"/>
              <parameter key="depth" value="length(replaceAll(Condition, &quot;[^&amp;]&quot;, &quot;&quot;)) + 1"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.5.001" expanded="true" height="82" name="Aggregate" width="90" x="715" y="34">
            <parameter key="use_default_aggregation" value="false"/>
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="default_aggregation_function" value="average"/>
            <list key="aggregation_attributes">
              <parameter key="depth" value="maximum"/>
            </list>
            <parameter key="group_by_attributes" value="nodes"/>
            <parameter key="count_all_combinations" value="false"/>
            <parameter key="only_distinct" value="false"/>
            <parameter key="ignore_missings" value="true"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Decision Tree" to_port="training set"/>
          <connect from_op="Decision Tree" from_port="model" to_op="Decision Tree to ExampleSet" to_port="tree"/>
          <connect from_op="Decision Tree to ExampleSet" from_port="exa" to_op="Extract Macro" to_port="example set"/>
          <connect from_op="Extract Macro" from_port="example set" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="126"/>
        </process>
      </operator>
    </process>
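
For readers who want to check the logic outside RapidMiner, the Generate Attributes and Aggregate steps of the process can be sketched in Python. This is only an illustration: the function names `rule_depth` and `tree_stats` and the sample conditions are invented here, and the sketch assumes each rule's conditions are joined by `&`, as the `depth` expression in the process does.

```python
import re


def rule_depth(condition: str) -> int:
    """Number of decisions on one path: count of '&' separators plus one."""
    # Mirrors length(replaceAll(Condition, "[^&]", "")) + 1 from the process:
    # strip everything that is not '&', then count what is left.
    return len(re.sub(r"[^&]", "", condition)) + 1


def tree_stats(conditions: list[str]) -> tuple[int, int]:
    """Return (number of leaves, tree depth) for a list of rule conditions."""
    if not conditions:
        raise ValueError("expected at least one rule condition")
    # One rule per leaf; the depth is the longest path (the Aggregate maximum).
    return len(conditions), max(rule_depth(c) for c in conditions)


# Hypothetical Condition values, invented for illustration only.
example_conditions = [
    "a3 > 2.450 & a4 > 1.750",
    "a3 > 2.450 & a4 ≤ 1.750 & a3 > 4.950",
    "a3 > 2.450 & a4 ≤ 1.750 & a3 ≤ 4.950",
    "a3 ≤ 2.450",
]

leaves, depth = tree_stats(example_conditions)
print(f"leaves={leaves}, depth={depth}")
```

Like the expression in the process, this reports a depth of 1 for a rule with a single condition.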

    Regards,
    Balázs
  • Rawan Member Posts: 2 Newbie
    @BalazsBarany,
    Sorry, but I couldn't find any & signs in the output when I used Tree to Rules to find the tree depth. :(
  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi @Rawan,

    Tree to Rules produces a different output: it joins the conditions of a rule with "and" instead of "&". There's no easy way to process that model description inside RapidMiner, but you can copy it and process it with an external tool such as a text editor.
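
If you go the external route, a short script can do the counting on the copied rule text. The following Python sketch assumes each rule sits on its own line in the form `if <condition> and <condition> then <label>`. That layout, the function name `stats_from_rule_text`, and the sample text are assumptions made for illustration, so adjust the patterns to what your copied description actually looks like.

```python
import re


def stats_from_rule_text(rule_text: str) -> tuple[int, int]:
    """Return (number of rules, maximum conditions per rule) from pasted text."""
    # Keep only lines that look like rules; summary lines are skipped.
    rules = [
        line.strip()
        for line in rule_text.splitlines()
        if line.strip().lower().startswith("if ")
    ]
    if not rules:
        raise ValueError("no lines starting with 'if' were found")

    depths = []
    for rule in rules:
        # Only look at the part before 'then', so the label is not counted.
        premise = re.split(r"\s+then\s+", rule, maxsplit=1)[0]
        # Conditions are assumed to be joined by the word 'and'.
        depths.append(len(re.findall(r"\s+and\s+", premise)) + 1)
    return len(rules), max(depths)


# Hypothetical pasted text, invented for illustration only.
sample = """
if a3 > 2.450 and a4 > 1.750 then Iris-virginica
if a3 > 2.450 and a4 ≤ 1.750 and a3 > 4.950 then Iris-virginica
if a3 ≤ 2.450 then Iris-setosa
some summary line that is not a rule
"""

rule_count, depth = stats_from_rule_text(sample)
print(f"rules={rule_count}, depth={depth}")
```

Note that an attribute name or nominal value containing the word "and" surrounded by spaces would be miscounted by this simple pattern.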

    The process I posted above (you can paste the XML into the XML panel in RapidMiner Studio) uses the Converters extension. It contains the operator that converts the tree to an example set, which you can then analyze inside RapidMiner.

    Regards,
    Balázs