RapidMiner

How to know ... in a tree model or tree to rules?

SOLVED
Contributor II

Hello there.

I am new to this tool and would like help with a question that has come up.

Is it possible to find out automatically which rule classifies a given instance in the output of a tree model or a Tree to Rules model?

My input has 200,000 examples and 80 attributes, and the resulting decision tree has a depth of 8. So far so good, but now I need to know automatically which decision rule is applied to classify a new instance.

Is this possible in RapidMiner?

Thanks for your help.

2 ACCEPTED SOLUTIONS

Accepted Solutions
Elite II
Solution
Accepted by topic author cypressproject0
‎06-09-2017 04:15 AM

Re: How to know ... in a tree model or tree to rules?

You can also use the "Operator Toolbox" extension. It has an operator named "Get Decision Tree Path" that produces a new text attribute holding the path you have to follow to reach the final node for each observation. Be aware that this extension requires the Text Mining extension, so make sure that one is installed as well.

 

Here's the process:

Screen Shot 2017-06-07 at 8.49.19 AM.png

And here's the output:

Screen Shot 2017-06-07 at 8.50.45 AM.png
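
For readers who want to see the idea outside RapidMiner, here is a minimal plain-Python sketch of what "one path per observation" means. It is not the operator's implementation: the nested-dict tree, the attribute names (`a1`, `a2`), the thresholds and the `decision_path` helper are all made up for illustration.

```python
# Toy tree (hypothetical): internal nodes test "attribute <= threshold",
# leaves carry a class label. Nothing here comes from a real RapidMiner model.
TREE = {
    "attribute": "a1", "threshold": 0.5,
    "le": {"leaf": "Rock"},
    "gt": {
        "attribute": "a2", "threshold": 0.3,
        "le": {"leaf": "Mine"},
        "gt": {"leaf": "Rock"},
    },
}


def decision_path(node, row):
    """Walk from the root to a leaf and return (path_text, predicted_label)."""
    conditions = []
    while "leaf" not in node:
        attr, thr = node["attribute"], node["threshold"]
        if row[attr] <= thr:
            conditions.append(f"{attr} <= {thr}")
            node = node["le"]
        else:
            conditions.append(f"{attr} > {thr}")
            node = node["gt"]
    return " & ".join(conditions), node["leaf"]


# Two made-up examples; each gets its own path string and prediction.
rows = [{"a1": 0.2, "a2": 0.9}, {"a1": 0.7, "a2": 0.1}]
paths = [decision_path(TREE, r) for r in rows]
for path, label in paths:
    print(f"{path}  ->  {label}")
```

The path string plays the role of the new text attribute: it records every split the example passed through on its way to the leaf.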

RMStaff
Solution
Accepted by topic author cypressproject0
‎06-09-2017 06:01 AM

Re: How to know ... in a tree model or tree to rules?

Hello,

 

This particular operator will work with a MetaCost operator, because it delivers a pure decision tree. It will not work out of the box with ensemble operators like Bagging.

By the way, the Converters extension will have an operator called Decision Tree to ExampleSet. It converts the table you see in the Description view into an example set. I think it fits a bit better.
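
As a rough illustration of turning a tree into a table of rules, here is a small plain-Python sketch. It does not reproduce the operator's actual output columns, which are not shown in this thread; the nested-dict tree, the attribute names and the `tree_to_rules` helper are hypothetical.

```python
# Toy tree (hypothetical): internal nodes test "attribute <= threshold",
# leaves carry a class label.
TREE = {
    "attribute": "a1", "threshold": 0.5,
    "le": {"leaf": "Rock"},
    "gt": {
        "attribute": "a2", "threshold": 0.3,
        "le": {"leaf": "Mine"},
        "gt": {"leaf": "Rock"},
    },
}


def tree_to_rules(node, conditions=()):
    """Flatten a tree into one row per leaf: the rule text and its prediction."""
    if "leaf" in node:
        rule = " & ".join(conditions) if conditions else "true"
        return [{"rule": rule, "prediction": node["leaf"]}]
    attr, thr = node["attribute"], node["threshold"]
    left = tree_to_rules(node["le"], conditions + (f"{attr} <= {thr}",))
    right = tree_to_rules(node["gt"], conditions + (f"{attr} > {thr}",))
    return left + right


rules = tree_to_rules(TREE)
for r in rules:
    print(r["rule"], "->", r["prediction"])
```

Each row of the resulting table is one root-to-leaf rule, which makes it easy to join or filter against scored data.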

 

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
8 REPLIES
Moderator

Re: How to know ... in a tree model or tree to rules?

You could use either, but the easiest way is to click on the Description view of the Decision Tree learner's model.

2017-06-07_10-08-50.png

 

 

 

 

Contributor II

Re: How to know ... in a tree model or tree to rules?

Thank you very much, this is just what I needed!

Another question: do you think this operator can be applied to a metaclassifier like MetaCost if the inner learner is a decision tree? If not, can you think of any other solution?

Thank you very much again.

Contributor II

Re: How to know ... in a tree model or tree to rules?

Thank you very much, Thomas.

This path is useful, but I needed something like the solution below. As I get further into this, though, the matter gets more complicated. The model I am developing is based on a metaclassifier that in turn uses a decision tree as its inner learner. Knowing the path or paths the model follows to reach a decision is vitally important for my project. Can you think of a solution?

Contributor II

Re: How to know ... in a tree model or tree to rules?

Hello Martin. Thank you very much for your help, but RapidMiner reports an error because the metaclassifier does not deliver a pure tree. It would be a good solution if my model were generated by a pure tree. I am stuck on this.

Regards.

RMStaff

Re: How to know ... in a tree model or tree to rules?

Whoops,

my mistake. MetaCost is of course also an ensemble model. Attached is a process with a small script that converts the meta model into a collection of models and then transforms each of them into an example set. Note that this does not take the application logic of MetaCost into account.

 

Best,

Martin

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Sonar" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="split_validation" compatibility="7.5.001" expanded="true" height="124" name="Validation" width="90" x="246" y="136">
        <process expanded="true">
          <operator activated="true" class="metacost" compatibility="7.5.001" expanded="true" height="82" name="MetaCost" width="90" x="112" y="30">
            <parameter key="cost_matrix" value="[0.0 2.0;3.0 0.0]"/>
            <process expanded="true">
              <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.5.001" expanded="true" height="76" name="Decision Tree (2)" width="90" x="313" y="30"/>
              <connect from_port="training set" to_op="Decision Tree (2)" to_port="training set"/>
              <connect from_op="Decision Tree (2)" from_port="model" to_port="model"/>
              <portSpacing port="source_training set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
            </process>
          </operator>
          <connect from_port="training" to_op="MetaCost" to_port="training set"/>
          <connect from_op="MetaCost" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance_classification" compatibility="7.5.001" expanded="true" height="82" name="Performance" width="90" x="179" y="30">
            <list key="class_weights"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="execute_script" compatibility="7.5.001" expanded="true" height="82" name="Execute Script" width="90" x="447" y="34">
        <parameter key="script" value="&#10;&#10;import com.rapidminer.operator.learner.meta.MetaCostModel;&#10;import com.rapidminer.operator.IOObjectCollection;&#10;import com.rapidminer.operator.learner.PredictionModel;&#10;&#10;MetaCostModel meta = input[0]&#10;IOObjectCollection&lt;PredictionModel&gt; io = new IOObjectCollection()&#10;&#10;for(int i = 0; i&lt; meta.getNumberOfModels();++i){&#10;&#9;io.add(meta.getModel(i))&#10;}&#10;return io&#10;&#10;&#10;"/>
      </operator>
      <operator activated="true" class="loop_collection" compatibility="7.5.001" expanded="true" height="82" name="Loop Collection" width="90" x="581" y="34">
        <process expanded="true">
          <operator activated="true" class="converters:dectree_2_example_set" compatibility="0.3.000" expanded="true" height="82" name="Decision Tree to ExampleSet" width="90" x="380" y="85"/>
          <connect from_port="single" to_op="Decision Tree to ExampleSet" to_port="tree"/>
          <connect from_op="Decision Tree to ExampleSet" from_port="exa" to_port="output 1"/>
          <portSpacing port="source_single" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Sonar" from_port="output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_op="Execute Script" to_port="input 1"/>
      <connect from_op="Execute Script" from_port="output 1" to_op="Loop Collection" to_port="collection"/>
      <connect from_op="Loop Collection" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="168"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
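
To make the "collection of models" idea concrete, here is a minimal plain-Python sketch that collects one path per member tree for a single example. It is only an illustration under stated assumptions: the two toy trees, the attribute names and the `decision_path` and `member_paths` helpers are hypothetical, and, like the process above, it does not model how MetaCost combines the members into a final prediction.

```python
def decision_path(node, row):
    """Walk one toy tree and return (path_text, predicted_label)."""
    conditions = []
    while "leaf" not in node:
        attr, thr = node["attribute"], node["threshold"]
        if row[attr] <= thr:
            conditions.append(f"{attr} <= {thr}")
            node = node["le"]
        else:
            conditions.append(f"{attr} > {thr}")
            node = node["gt"]
    return " & ".join(conditions), node["leaf"]


# Two hypothetical member trees standing in for the models inside an ensemble.
TREE_A = {
    "attribute": "a1", "threshold": 0.5,
    "le": {"leaf": "Rock"},
    "gt": {"leaf": "Mine"},
}
TREE_B = {
    "attribute": "a2", "threshold": 0.3,
    "le": {"leaf": "Rock"},
    "gt": {
        "attribute": "a1", "threshold": 0.2,
        "le": {"leaf": "Rock"},
        "gt": {"leaf": "Mine"},
    },
}
ENSEMBLE = [TREE_A, TREE_B]


def member_paths(trees, row):
    """Return (member_index, path_text, member_prediction) for every tree."""
    return [(i, *decision_path(tree, row)) for i, tree in enumerate(trees)]


row = {"a1": 0.4, "a2": 0.6}
result = member_paths(ENSEMBLE, row)
for index, path, label in result:
    print(f"tree {index}: {path} -> {label}")
```

The point of the sketch is that one example yields as many paths as there are member trees, and those paths need not agree with each other.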

 

Contributor II

Re: How to know ... in a tree model or tree to rules?

Thanks again Martin.

I just need to extract the decision vector, or in this case the possible vectors, so I can plot them later. Correct me if I'm wrong, because I'm just starting with this and everything is new to me. Each decision made by this metaclassifier will be defined by paths that may or may not be the same. The metaclassifier in this case builds n trees and then does its work. Is that right? Is it possible to get what I need with this script? How would you do it?

Regards.