RapidMiner

The new Get Local Interpretation Operator...

RM Certified Analyst

Colleagues:

 

I have been experimenting with the Get Local Interpretation operator introduced in 7.6, which is included in the RapidMiner extension "Operator Toolbox" under the "Models" sub-folder. I tried using this operator with a grouped model (a normalization model plus a kNN model) and got an error. The error (screenshot attached) states that I am passing the wrong type of input to the operator.

 

With a kNN, I have heard that it is best to normalize the input data, and then group the normalization model with the kNN model using the Group Models operator.  The error message seems to indicate that the Get Local Interpretation operator cannot accept Grouped Models as an input.  
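To make the idea concrete, here is a minimal Python sketch of what grouping a normalization model with a kNN model amounts to: the grouped model first rescales an incoming example with the statistics learned during training and then hands it to the learner. The classes `RangeNormalization`, `KNN` and `GroupedModel` are hypothetical stand-ins written for this illustration; they are not the RapidMiner implementation.

```python
import math


class RangeNormalization:
    """Hypothetical stand-in for a normalization (preprocessing) model."""

    def __init__(self, rows):
        columns = list(zip(*rows))
        self.mins = [min(c) for c in columns]
        self.maxs = [max(c) for c in columns]

    def apply(self, row):
        # Rescale each attribute to the 0-1 range seen in the training data.
        return [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, self.mins, self.maxs)
        ]


class KNN:
    """Hypothetical stand-in for a k-nearest-neighbour prediction model."""

    def __init__(self, rows, labels, k=1):
        self.rows, self.labels, self.k = rows, labels, k

    def predict(self, row):
        nearest = sorted(
            zip(self.rows, self.labels),
            key=lambda pair: math.dist(pair[0], row),
        )[: self.k]
        votes = [label for _, label in nearest]
        return max(set(votes), key=votes.count)


class GroupedModel:
    """Applies the normalization first, then the learner."""

    def __init__(self, normalization, learner):
        self.normalization = normalization
        self.learner = learner

    def predict(self, row):
        return self.learner.predict(self.normalization.apply(row))


# Two attributes on very different scales (made-up illustration data).
rows = [[0.1, 1000.0], [0.9, 1100.0]]
labels = ["a", "b"]
query = [0.85, 1020.0]

raw_knn = KNN(rows, labels)
normalization = RangeNormalization(rows)
grouped = GroupedModel(
    normalization,
    KNN([normalization.apply(r) for r in rows], labels),
)

raw_prediction = raw_knn.predict(query)          # dominated by the large-scale attribute
grouped_prediction = grouped.predict(query)      # both attributes contribute comparably
```

On this toy data the raw kNN follows the second attribute alone, while the grouped model also takes the first attribute into account and arrives at a different label.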

 

I would be grateful for any advice on how to get around this, if at all possible, given the usefulness of the Get Local Interpretation operator and given that it is sometimes necessary to normalize the input data that will be processed by a learner.

 

Best wishes, Michael Martin

19 REPLIES
RM Certified Expert

Re: The new Get Local Interpretation Operator...

It’s hard to tell from the screenshot because the error message is hiding the operators. Is the model port receiving a model? The error seems to be caused by the port receiving something other than a model as its input.
RM Certified Analyst

Re: The new Get Local Interpretation Operator...

Hello Thomas:

 

Thanks for your reply. I think the attached screenshots may help. The mod input port of the Get Local Interpretation operator is receiving a Grouped Model (from a Multiply operator). The error seems to be related to the fact that the GLI operator wants a Prediction Model, not a Grouped Model that includes a Prediction Model.

 

If you have any suggestions as to how I could work around this, I would be grateful, as it is sometimes important to normalize the data, especially with a kNN learner and within a Cross Validation operator, where you want the data in each fold normalized separately before it is fed to the learner.
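As a rough illustration of what I mean by normalizing each fold separately, here is a small Python sketch. It assumes range (0-1) normalization; the helper names `fit_range`, `apply_range` and `folds` are made up for this example. The normalization statistics are learned from the training part of each fold only and then applied unchanged to the held-out part.

```python
def fit_range(rows):
    """Learn per-attribute minimum and maximum from the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_range(row, mins, maxs):
    """Rescale a row with previously learned minimum and maximum values."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, mins, maxs)
    ]


def folds(rows, n_folds):
    """Yield (train, test) splits; every n-th row goes into the test part."""
    for i in range(n_folds):
        test = rows[i::n_folds]
        train = [r for j, r in enumerate(rows) if j % n_folds != i]
        yield train, test


rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]  # made-up data

normalized_folds = []
for train, test in folds(rows, 3):
    mins, maxs = fit_range(train)  # statistics come from the training part
    train_n = [apply_range(r, mins, maxs) for r in train]
    test_n = [apply_range(r, mins, maxs) for r in test]  # same statistics reused
    normalized_folds.append((mins, maxs, train_n, test_n))
```

Because the held-out rows are scaled with the training statistics, their values can fall slightly outside the 0-1 range. Grouping the normalization model with the learner is what keeps this bookkeeping together when the model is applied.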

 

Thanks for considering this if you get a chance, and best wishes,

 

Michael

RM Staff

Re: The new Get Local Interpretation Operator...

Dear Michael,

 

I am the author of the operator and messed it up. The operator expects a prediction model, which is different from a grouped (or meta) model.

 

I will check how to fix this.

 

For now: normalize the data before GLI and use just the plain k-NN model with it. That should work.

 

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Staff

Re: The new Get Local Interpretation Operator...

Dear Michael,

 

I've sent you an updated version of the Toolbox that can handle GroupedModels.

 

A bit of background: RapidMiner has a Model class (AbstractModel) which has various child classes. I expected to get a PredictionModel, since most RM models are implementations of this. I have now learned that GroupedModel (and some others) aren't. I've changed the implementation so that you can connect any (Abstract)Model.

Downside: you can now also connect models that make no sense here (Normalization, Nom2Numeric, ...). I need to figure out how to check whether the connected model creates a label; I'm not sure how yet. In the version I shared, such a model simply crashes in the inner operator.
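To illustrate the type issue, here is a minimal Python sketch. The class names mirror the ones mentioned above, but they are simplified stand-ins and not the actual RapidMiner Java classes. The `creates_label` flag and the `produces_label` function are assumptions made for this example; they show one conceivable check, not what the Toolbox does.

```python
class AbstractModel:
    """Stand-in for the common base type of all models."""
    creates_label = False  # hypothetical flag, used only in this sketch


class PredictionModel(AbstractModel):
    """Stand-in for models that predict a label (k-NN, decision tree, ...)."""
    creates_label = True


class NormalizationModel(AbstractModel):
    """Stand-in for a preprocessing model; it does not predict anything."""


class GroupedModel(AbstractModel):
    """Stand-in for a group of models applied one after another."""

    def __init__(self, models):
        self.models = list(models)


def accepts_strict(model):
    # Original behaviour as described: only prediction models pass.
    return isinstance(model, PredictionModel)


def accepts_any_model(model):
    # Relaxed behaviour: any model passes, including pure preprocessing models.
    return isinstance(model, AbstractModel)


def produces_label(model):
    # One conceivable check: a group creates a label if any member does.
    if isinstance(model, GroupedModel):
        return any(produces_label(m) for m in model.models)
    return model.creates_label


knn = PredictionModel()
normalization = NormalizationModel()
grouped = GroupedModel([normalization, knn])
```

In this sketch the strict check rejects `grouped` even though it contains a prediction model, the relaxed check also lets `normalization` through on its own, and `produces_label` separates the two cases.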

 

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Certified Analyst

Re: The new Get Local Interpretation Operator...

Dear Martin:

Many thanks for your message, and thanks also for the explanation! ;-) I checked for an update to the Toolbox extension in the Marketplace, and I don't see anything in my email inbox either. How can I update the Toolbox to the new version?

 

I will then try it out and let you know how things go.  GLI is a very useful operator - thank you for developing it!  I have used GLI in sales presentations (that visualized outputs in Tableau) and business people really like it.  The examples I used were based on Decision Tree learners.  One thing I want to do is feed various Decision Tree model outputs to GLI and Tree to Rules and see what is the same and what is different.

 

Kind regards, Michael ;-)

RM Staff

Re: The new Get Local Interpretation Operator...

Hi Michael,

 

What you can do is use an Optimize operator inside the GLI and take the decision tree that describes the model best. Is that what you need?

 

Once I have managed to make GLI runnable on single examples, one could simply wrap a loop over models around it.

 

Best,

Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Certified Analyst

Re: The new Get Local Interpretation Operator...

Hello Martin:

 

I just tested a process in which I normalize the data (to the 0-1 range rather than with z-scores, which produce negative values that GLI seems not to like) before feeding the data to Optimize Parameters, etc. Within my process I used the configuration from one of the tutorials in the help for the GLI operator (the one that uses Weight by Gini Index).
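For reference, the difference between the two normalization methods can be sketched in a few lines of Python. The values are made up, and the population standard deviation is assumed for the z-score; this only shows why one method yields negative values and the other does not.

```python
import statistics

values = [2.0, 4.0, 6.0, 8.0]  # made-up attribute values

# Range transformation: rescale to the interval from 0 to 1.
lo, hi = min(values), max(values)
range_scaled = [(v - lo) / (hi - lo) for v in values]

# Z-transformation: subtract the mean, divide by the standard deviation.
mean = statistics.mean(values)
std = statistics.pstdev(values)
z_scores = [(v - mean) / std for v in values]

print(range_scaled)
print(z_scores)
```

Every value below the mean gets a negative z-score, whereas the range transformation keeps all values between 0 and 1.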

 

The process runs through OK, but I get no "Decision Tree Path" (the field is blank for all rows in the dataset) in the output.  I do get Importance values for various attributes.

 

What I would like to be able to do is pass my grouped model to the GLI operator as per my original post. If I understood you correctly, there is now a version of the GLI operator that would allow me to do this. Is there an updated operator that I can get access to?

 

I can also see that I should do some studying to make sure I am using the GLI operator correctly.  It enables great outputs, but the driver (me) has to drive the car (the GLI operator) correctly!

 

Best wishes, Michael ;-)

RM Certified Analyst

Re: The new Get Local Interpretation Operator...

Hi Martin: 

 

I now get a Decision Tree Path in the test process I wrote about a little while ago - I was not using enough attributes to generate one.  As I know that data very well, I can tell that the Decision Tree Path is not a complete map of the data - but I am sure that is because I need to do some optimising of the learner within the GLI operator.  

 

If there is a way for me to pass a Grouped Model to the GLI, that would be great.  ;-)

 

Kind regards, Michael

RM Staff

Re: The new Get Local Interpretation Operator...

Hi Michael,

 

Check your private messages. I've sent you a version with a working GLI for grouped models. I will investigate the issue with negative values.

 

Best,

Martin

 

Edit: Negative data works fine for me. See the attached example process.

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="false" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="generate_data" compatibility="7.6.001" expanded="true" height="68" name="Generate Data" width="90" x="179" y="34">
        <parameter key="target_function" value="quadratic classification"/>
      </operator>
      <operator activated="true" class="h2o:gradient_boosted_trees" compatibility="7.6.001" expanded="true" height="103" name="Gradient Boosted Trees" width="90" x="380" y="34">
        <parameter key="reproducible" value="true"/>
        <list key="expert_parameters"/>
      </operator>
      <operator activated="true" class="operator_toolbox:get_interpretation_subprocess" compatibility="0.5.001" expanded="true" height="124" name="Get Local Interpretation" width="90" x="514" y="34">
        <process expanded="true">
          <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="103" name="Multiply" width="90" x="44" y="34"/>
          <operator activated="true" class="weight_by_gini_index" compatibility="7.6.001" expanded="true" height="82" name="Weight by Gini Index (2)" width="90" x="179" y="34"/>
          <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="7.6.001" expanded="true" height="82" name="Decision Tree" width="90" x="179" y="136">
            <parameter key="maximal_depth" value="5"/>
            <parameter key="apply_pruning" value="false"/>
            <parameter key="apply_prepruning" value="false"/>
          </operator>
          <connect from_port="training set" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Weight by Gini Index (2)" to_port="example set"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Decision Tree" to_port="training set"/>
          <connect from_op="Weight by Gini Index (2)" from_port="weights" to_port="Weight Vector"/>
          <connect from_op="Decision Tree" from_port="model" to_port="Prediction Model"/>
          <portSpacing port="source_training set" spacing="0"/>
          <portSpacing port="sink_Weight Vector" spacing="0"/>
          <portSpacing port="sink_Prediction Model" spacing="0"/>
          <portSpacing port="sink_Performance Vector" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Gradient Boosted Trees" to_port="training set"/>
      <connect from_op="Gradient Boosted Trees" from_port="model" to_op="Get Local Interpretation" to_port="mod"/>
      <connect from_op="Gradient Boosted Trees" from_port="exampleSet" to_op="Get Local Interpretation" to_port="exa"/>
      <connect from_op="Get Local Interpretation" from_port="exa" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner