RapidMiner

How to know ... in a tree model or tree to rules?

SOLVED
Learner I cypressproject0

Hello there.

I am new to this tool and would appreciate your help with a question that has come up.

The question is: is it possible to find out automatically which rule classifies a given instance in the output of a Decision Tree or Tree to Rules model?

In this case my input has 200,000 examples and 80 attributes, and the generated tree has a depth of 8. So far so good, but now I need to know automatically which decision rule is applied to classify a new instance.

Is this possible in RapidMiner?

Thanks for your help.

8 REPLIES
RM Certified Expert

Re: How to know ... in a tree model or tree to rules?

You could use either, but the easiest way is to click on the Description view of the Decision Tree model.

2017-06-07_10-08-50.png

 

 

 

 

Guru
Solution

Re: How to know ... in a tree model or tree to rules?

You can also use the "Operator Toolbox" extension. It has an operator named "Get Decision Tree Path" that produces a new text attribute containing the path you have to follow to reach the final node for each observation. Be aware that this extension requires the Text Mining extension, so make sure you install it too.

 

Here's the process:

Screen Shot 2017-06-07 at 8.49.19 AM.png

And here's the output:

Screen Shot 2017-06-07 at 8.50.45 AM.png
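
To make the idea of a per-example path concrete, here is a minimal Python sketch of what such a path column could contain. It is not the Operator Toolbox implementation and does not use the RapidMiner API; the `Leaf` and `Split` classes, the toy tree, its thresholds, and the `" & "` separator are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str          # class predicted at this leaf


@dataclass
class Split:
    attribute: str      # attribute tested at this node
    threshold: float    # numeric split value
    low: "Node"         # subtree taken when attribute <= threshold
    high: "Node"        # subtree taken when attribute > threshold


Node = Union[Leaf, Split]


def decision_path(node, example):
    """Walk the tree for one example; return (path_text, predicted_label)."""
    conditions = []
    while isinstance(node, Split):
        if example[node.attribute] <= node.threshold:
            conditions.append(f"{node.attribute} <= {node.threshold}")
            node = node.low
        else:
            conditions.append(f"{node.attribute} > {node.threshold}")
            node = node.high
    return " & ".join(conditions), node.label


# Hypothetical toy tree; attribute names and thresholds are made up.
tree = Split("attribute_11", 0.2,
             Leaf("Rock"),
             Split("attribute_27", 0.8, Leaf("Mine"), Leaf("Rock")))

examples = [
    {"attribute_11": 0.1, "attribute_27": 0.5},
    {"attribute_11": 0.3, "attribute_27": 0.5},
]
for ex in examples:
    path, label = decision_path(tree, ex)
    print(f"{label}: {path}")
# Rock: attribute_11 <= 0.2
# Mine: attribute_11 > 0.2 & attribute_27 <= 0.8
```

Each example ends up with exactly one path, which is why the result can be stored as one extra text column next to the prediction.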

 

 

Learner I cypressproject0

Re: How to know ... in a tree model or tree to rules?

Thank you very much, this is just what I needed!

Another question: do you think this tool can be applied to a metaclassifier like MetaCost if the inner learner is a decision tree? If not, can you think of any other solution?

Thank you very much again.

RM Staff
Solution

Re: How to know ... in a tree model or tree to rules?

Hello,

 

This particular operator will work with the MetaCost operator, because it delivers a pure decision tree. It will not work out of the box with ensemble operators like Bagging.

 

By the way, in the Converters extension we will have an operator called Decision Tree to ExampleSet. It converts the table you see in the Description view into an example set. I think it fits a bit better.
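
As a rough picture of what turning a tree into a table of rules means, the Python sketch below flattens a toy tree into one row per leaf. It is only an illustration under stated assumptions: the nested-tuple tree encoding, the `rule` and `label` column names, and the thresholds are hypothetical and do not reflect the actual output of the Decision Tree to ExampleSet operator.

```python
def tree_to_rows(node, conditions=()):
    """Flatten a tree into one row per leaf: the rule text and its label.

    A leaf is a plain string (the class label); an inner node is a tuple
    (attribute, threshold, low_subtree, high_subtree).
    """
    if isinstance(node, str):
        return [{"rule": " & ".join(conditions) or "true", "label": node}]
    attribute, threshold, low, high = node
    return (
        tree_to_rows(low, conditions + (f"{attribute} <= {threshold}",))
        + tree_to_rows(high, conditions + (f"{attribute} > {threshold}",))
    )


# Hypothetical toy tree; attribute names and thresholds are made up.
tree = ("attribute_11", 0.2,
        "Rock",
        ("attribute_27", 0.8, "Mine", "Rock"))

rows = tree_to_rows(tree)
for row in rows:
    print(row["label"], "<-", row["rule"])
# Rock <- attribute_11 <= 0.2
# Mine <- attribute_11 > 0.2 & attribute_27 <= 0.8
# Rock <- attribute_11 > 0.2 & attribute_27 > 0.8
```

Unlike a per-example path, this table lists every rule of the tree once, regardless of which examples are scored.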

 

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
Learner I cypressproject0

Re: How to know ... in a tree model or tree to rules?

Thank you very much, Thomas.

This path is useful, but I needed something like the solution below. However, the further I get, the more complicated the matter becomes. The model I am developing is based on a metaclassifier whose inner learner is a decision tree. Knowing the path or paths the model follows to reach a decision is vitally important for my project. Can you think of a solution?

Learner I cypressproject0

Re: How to know ... in a tree model or tree to rules?

Hello Martin. Thank you very much for your help, but RapidMiner reports an error because the metaclassifier does not deliver a pure tree. It would be a good solution if my model were a pure tree. I am stuck on this.

Regards.

RM Staff

Re: How to know ... in a tree model or tree to rules?

Whoops,

my mistake. MetaCost is of course also an ensemble model. Attached is a process with a small script that converts the meta model into a collection of models, which are then transformed into example sets. Note that this does not take the application logic of MetaCost into account.

 

Best,

Martin

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Sonar" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="split_validation" compatibility="7.5.001" expanded="true" height="124" name="Validation" width="90" x="246" y="136">
        <process expanded="true">
          <operator activated="true" class="metacost" compatibility="7.5.001" expanded="true" height="82" name="MetaCost" width="90" x="112" y="30">
            <parameter key="cost_matrix" value="[0.0 2.0;3.0 0.0]"/>
            <process expanded="true">
              <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="76" name="Decision Tree (2)" width="90" x="313" y="30"/>
              <connect from_port="training set" to_op="Decision Tree (2)" to_port="training set"/>
              <connect from_op="Decision Tree (2)" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
            </process>
          </operator>
          <connect from_port="training" to_op="MetaCost" to_port="training set"/>
          <connect from_op="MetaCost" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance_classification" compatibility="7.5.001" expanded="true" height="82" name="Performance" width="90" x="179" y="30">
            <list key="class_weights"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="execute_script" compatibility="7.5.001" expanded="true" height="82" name="Execute Script" width="90" x="447" y="34">
        <parameter key="script" value="&#10;&#10;import com.rapidminer.operator.learner.meta.MetaCostModel;&#10;import com.rapidminer.operator.IOObjectCollection;&#10;import com.rapidminer.operator.learner.PredictionModel;&#10;&#10;MetaCostModel meta = input[0]&#10;IOObjectCollection&lt;PredictionModel&gt; io = new IOObjectCollection()&#10;&#10;for(int i = 0; i&lt; meta.getNumberOfModels();++i){&#10;&#9;io.add(meta.getModel(i))&#10;}&#10;return io&#10;&#10;&#10;"/>
      </operator>
      <operator activated="true" class="loop_collection" compatibility="7.5.001" expanded="true" height="82" name="Loop Collection" width="90" x="581" y="34">
        <process expanded="true">
          <operator activated="true" class="converters:dectree_2_example_set" compatibility="0.3.000" expanded="true" height="82" name="Decision Tree to ExampleSet" width="90" x="380" y="85"/>
          <connect from_port="single" to_op="Decision Tree to ExampleSet" to_port="tree"/>
          <connect from_op="Decision Tree to ExampleSet" from_port="exa" to_port="output 1"/>
          <portSpacing port="source_single" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Sonar" from_port="output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_op="Execute Script" to_port="input 1"/>
      <connect from_op="Execute Script" from_port="output 1" to_op="Loop Collection" to_port="collection"/>
      <connect from_op="Loop Collection" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="168"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
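
Once the meta model has been unpacked into a collection of trees, each tree can be asked for its own path for a given example. The Python sketch below illustrates that idea outside RapidMiner; the nested-tuple trees, attribute names, and thresholds are hypothetical. Like the process above, it only reports what each inner tree does and does not reproduce how MetaCost combines the trees into a final, cost-based decision.

```python
from collections import Counter


def decision_path(node, example):
    """Return (path_text, label) for one example in one tree.

    A leaf is a plain string (the class label); an inner node is a tuple
    (attribute, threshold, low_subtree, high_subtree).
    """
    conditions = []
    while not isinstance(node, str):
        attribute, threshold, low, high = node
        if example[attribute] <= threshold:
            conditions.append(f"{attribute} <= {threshold}")
            node = low
        else:
            conditions.append(f"{attribute} > {threshold}")
            node = high
    return " & ".join(conditions), node


# Hypothetical ensemble of three small trees (stand-ins for the inner models).
ensemble = [
    ("attribute_11", 0.2, "Rock", "Mine"),
    ("attribute_11", 0.2, "Rock", "Mine"),
    ("attribute_27", 0.8, "Mine", "Rock"),
]

example = {"attribute_11": 0.3, "attribute_27": 0.9}

# One path per tree; identical paths are grouped and counted.
paths = [decision_path(tree, example) for tree in ensemble]
path_counts = Counter(paths)
for (path, label), count in path_counts.items():
    print(f"{count} tree(s) -> {label}: {path}")
# 2 tree(s) -> Mine: attribute_11 > 0.2
# 1 tree(s) -> Rock: attribute_27 > 0.8
```

Grouping the paths this way shows at a glance whether the inner trees agree on a route for an example or split across several.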

 

Learner I cypressproject0

Re: How to know ... in a tree model or tree to rules?

Thanks again Martin.

I just need to extract the decision vector, or in this case the possible vectors, so that I can plot them later. Correct me if I'm wrong; I'm just starting with this and everything is new to me. Each decision of this metaclassifier is defined by paths that may or may not be identical. The metaclassifier in this case builds n trees and then does its work. Is that right? Is it possible to get what I need with this script? How would you do it?

Regards.
