How to know ... in a tree model or tree to rules?

cypressproject0 Member Posts: 5 Contributor I
edited December 2018 in Help

Hello there.

I am new to this tool and would like help with a question that has come up.

Is it possible to find out automatically which rule classifies a given instance in the output of a tree model or a Tree to Rules model?

In my case, the input has 200,000 examples and 80 attributes, and the tree generated by the model has a depth of 8. So far so good, but now I need to know automatically which decision rule is applied when a new instance is classified.

Is this possible in RapidMiner?

Thanks for your help.

Best Answers

  • earmijo Member Posts: 270 Unicorn
    Solution Accepted

    You can also use the "Operator Toolbox" extension. It has an operator named "Get Decision Tree Path" that produces a new variable (a text variable) containing the path you have to follow to reach the final node for each observation. Be aware that this extension requires the Text Mining extension, so make sure you load it too.

    Here's the process:

    Screen Shot 2017-06-07 at 8.49.19 AM.png

    And here's the output:

    Screen Shot 2017-06-07 at 8.50.45 AM.png
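
To make the idea of a per-observation path concrete, here is a minimal Python sketch. It is not the Operator Toolbox implementation and does not use any RapidMiner API; the tuple-based tree layout, the `tree_path` helper, and the attribute names and thresholds are all assumptions made up for illustration.

```python
# A leaf is a plain string (the predicted label).
# A split is a tuple: (attribute, threshold, low_subtree, high_subtree),
# where low_subtree is followed when value <= threshold.
# This layout is hypothetical, not RapidMiner's internal tree model.

def tree_path(tree, example):
    """Walk one example down the tree and return (path_text, label)."""
    conditions = []
    node = tree
    while not isinstance(node, str):
        attribute, threshold, low, high = node
        if example[attribute] <= threshold:
            conditions.append(f"{attribute} <= {threshold}")
            node = low
        else:
            conditions.append(f"{attribute} > {threshold}")
            node = high
    return " & ".join(conditions), node


# Made-up two-level tree over two numeric attributes.
toy_tree = ("a1", 0.5, "Rock", ("a2", 0.3, "Mine", "Rock"))

path, label = tree_path(toy_tree, {"a1": 0.7, "a2": 0.1})
print(path, "->", label)
```

Applying such a function to every row of a data set gives one text path per observation, which is the kind of result described above.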

     

     

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Solution Accepted

    Hello,

    This particular operator will work with a MetaCost operator, because it delivers a pure decision tree. It will not work out of the box with ensemble operators like Bagging.

    By the way, in the Converters extension we will have an operator called Decision Tree to ExampleSet. It converts the table you see in the Description view into an example set. I think it fits a bit better.
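
As a rough illustration of turning a tree into a table, the following Python sketch lists every root-to-leaf rule as one row. It is not the Converters extension and the column names (`rule`, `depth`, `prediction`) are assumptions; the actual columns produced by the operator may differ. The tuple-based tree layout is also hypothetical.

```python
# A leaf is a string label; a split is
# (attribute, threshold, low_subtree, high_subtree).
# Both the layout and the output columns are illustrative assumptions.

def tree_to_rows(tree, conditions=()):
    """Return one dict per leaf, describing the rule that leads to it."""
    if isinstance(tree, str):
        return [{
            "rule": " & ".join(conditions) if conditions else "true",
            "depth": len(conditions),
            "prediction": tree,
        }]
    attribute, threshold, low, high = tree
    return (
        tree_to_rows(low, conditions + (f"{attribute} <= {threshold}",))
        + tree_to_rows(high, conditions + (f"{attribute} > {threshold}",))
    )


# Made-up tree with three leaves.
toy_tree = ("a1", 0.5, "Rock", ("a2", 0.3, "Mine", "Rock"))

rows = tree_to_rows(toy_tree)
for row in rows:
    print(row)
```

A table like this has one row per rule rather than one row per observation, so it describes the whole model at once.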

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    You could use both, but the easiest way is to click on the Description for the Decision Tree learner.

    2017-06-07_10-08-50.png

  • cypressproject0 Member Posts: 5 Contributor I

    Thank you very much, this is just what I needed!

    Another question: do you think this tool can be applied to a metaclassifier such as MetaCost if the inner learner is a decision tree? If not, can you think of another solution?

    Thank you very much again.

  • cypressproject0 Member Posts: 5 Contributor I

    Thank you very much, Thomas.

    This path is useful, but I needed something like the solution below. As I make progress, though, the matter gets more complicated. The model I am developing is based on a metaclassifier that in turn uses a decision tree as its learner. Knowing the path or paths the model follows to reach a decision is vitally important for my project. Can you think of a solution?

  • cypressproject0 Member Posts: 5 Contributor I

    Hello Martin. Thank you very much for your help, but RapidMiner reports an error because the metaclassifier does not deliver a pure tree. It would be a good solution if my model were generated by a pure tree. I am stuck on this.

    Regards.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Whoops, my mistake. MetaCost is of course also an ensemble model. Attached is a process with a small script that converts the meta model into a collection of models and then transforms them into example sets. Note that this does not take the application logic of MetaCost into account.

     

    Best,

    Martin

    <?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Sonar" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Sonar"/>
    </operator>
    <operator activated="true" class="split_validation" compatibility="7.5.001" expanded="true" height="124" name="Validation" width="90" x="246" y="136">
    <process expanded="true">
    <operator activated="true" class="metacost" compatibility="7.5.001" expanded="true" height="82" name="MetaCost" width="90" x="112" y="30">
    <parameter key="cost_matrix" value="[0.0 2.0;3.0 0.0]"/>
    <process expanded="true">
    <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="76" name="Decision Tree (2)" width="90" x="313" y="30"/>
    <connect from_port="training set" to_op="Decision Tree (2)" to_port="training set"/>
    <connect from_op="Decision Tree (2)" from_port="model" to_port="model"/>
    <portSpacing port="source_training set" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    </process>
    </operator>
    <connect from_port="training" to_op="MetaCost" to_port="training set"/>
    <connect from_op="MetaCost" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance_classification" compatibility="7.5.001" expanded="true" height="82" name="Performance" width="90" x="179" y="30">
    <list key="class_weights"/>
    </operator>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="execute_script" compatibility="7.5.001" expanded="true" height="82" name="Execute Script" width="90" x="447" y="34">
    <parameter key="script" value="&#10;&#10;import com.rapidminer.operator.learner.meta.MetaCostModel;&#10;import com.rapidminer.operator.IOObjectCollection;&#10;import com.rapidminer.operator.learner.PredictionModel;&#10;&#10;MetaCostModel meta = input[0]&#10;IOObjectCollection&lt;PredictionModel&gt; io = new IOObjectCollection()&#10;&#10;for(int i = 0; i&lt; meta.getNumberOfModels();++i){&#10;&#9;io.add(meta.getModel(i))&#10;}&#10;return io&#10;&#10;&#10;"/>
    </operator>
    <operator activated="true" class="loop_collection" compatibility="7.5.001" expanded="true" height="82" name="Loop Collection" width="90" x="581" y="34">
    <process expanded="true">
    <operator activated="true" class="converters:dectree_2_example_set" compatibility="0.3.000" expanded="true" height="82" name="Decision Tree to ExampleSet" width="90" x="380" y="85"/>
    <connect from_port="single" to_op="Decision Tree to ExampleSet" to_port="tree"/>
    <connect from_op="Decision Tree to ExampleSet" from_port="exa" to_port="output 1"/>
    <portSpacing port="source_single" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Sonar" from_port="output" to_op="Validation" to_port="training"/>
    <connect from_op="Validation" from_port="model" to_op="Execute Script" to_port="input 1"/>
    <connect from_op="Execute Script" from_port="output 1" to_op="Loop Collection" to_port="collection"/>
    <connect from_op="Loop Collection" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="168"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
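
The process above only unpacks the inner trees. As a conceptual sketch of what one could do with such a collection, the Python below traces one example through each tree and counts how often each path occurs. It does not use the RapidMiner API and, like the process, it ignores how MetaCost combines the trees into a final prediction; the tuple-based tree layout, the helper names, and the three example trees are assumptions for illustration.

```python
from collections import Counter

# A leaf is a string label; a split is
# (attribute, threshold, low_subtree, high_subtree). Hypothetical layout.

def tree_path(tree, example):
    """Walk one example down one tree and return (path_text, label)."""
    conditions = []
    node = tree
    while not isinstance(node, str):
        attribute, threshold, low, high = node
        if example[attribute] <= threshold:
            conditions.append(f"{attribute} <= {threshold}")
            node = low
        else:
            conditions.append(f"{attribute} > {threshold}")
            node = high
    return " & ".join(conditions), node


def ensemble_paths(trees, example):
    """Return the (path, label) pair of every inner tree for one example."""
    return [tree_path(tree, example) for tree in trees]


# Made-up stand-ins for the inner trees of a meta model.
trees = [
    ("a1", 0.5, "Rock", ("a2", 0.3, "Mine", "Rock")),
    ("a2", 0.4, "Mine", "Rock"),
    ("a1", 0.5, "Rock", ("a2", 0.3, "Mine", "Rock")),
]

paths = ensemble_paths(trees, {"a1": 0.7, "a2": 0.1})
path_counts = Counter(path for path, _ in paths)
for path, count in path_counts.items():
    print(count, "tree(s):", path)
```

The inner trees may or may not agree on the path, so the result for a single example is a set of paths with counts rather than one rule.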

     

  • cypressproject0 Member Posts: 5 Contributor I

    Thanks again Martin.

    I just need to extract the decision vector, or in this case the possible vectors, so that I can plot them later. Correct me if I'm wrong, because I'm just getting started and everything is new to me. Each decision of this metaclassifier is defined by paths that may or may not be identical. The metaclassifier in this case builds n trees and then does its work. Is that right? Is it possible to get what I need with this script? How would you do it?

    Regards.
