New Versions 0.3.0 for the Operator Toolbox and the Converters Extension available.
We are happy to release version 0.3.0 of the Converters and Operator Toolbox Extensions. We worked hard to add useful new functionality and to polish existing features. Without further ado, here are the new features for your data science processes.
Converters - Decision Tree to ExampleSet
You can now convert a decision tree model into an ExampleSet. Each path from the root to a leaf is represented by one row in the ExampleSet. The row holds the condition for the path as a nominal attribute, together with the prediction and the number of examples collected in the leaf.
This is how it looks for a decision tree which was trained on the Iris data sample.
Results of the Decision Tree to ExampleSet Operator
Check out the attached process (see below) or the tutorial process in RapidMiner Studio.
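To make the row-per-path idea concrete, here is a minimal Python sketch. It is not the Operator's implementation: the toy tree, its field names and its leaf counts are made up for illustration, and the exact text format of the condition attribute is an assumption.

```python
# Hypothetical toy tree; structure, field names and counts are illustrative only.
tree = {
    "attribute": "a3", "threshold": 2.45,
    "left": {"prediction": "Iris-setosa", "count": 50},
    "right": {
        "attribute": "a4", "threshold": 1.75,
        "left": {"prediction": "Iris-versicolor", "count": 54},
        "right": {"prediction": "Iris-virginica", "count": 46},
    },
}


def tree_to_rows(node, conditions=()):
    """Return one row (a dict) per root-to-leaf path of the tree."""
    if "prediction" in node:  # reached a leaf: emit the collected path
        return [{
            "condition": " & ".join(conditions) if conditions else "true",
            "prediction": node["prediction"],
            "count": node["count"],
        }]
    name, t = node["attribute"], node["threshold"]
    rows = tree_to_rows(node["left"], conditions + (f"{name} <= {t}",))
    rows += tree_to_rows(node["right"], conditions + (f"{name} > {t}",))
    return rows


rows = tree_to_rows(tree)
for row in rows:
    print(row)
```

A tree with three leaves yields three rows, each carrying the full condition of its path.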
Converters - Logistic Regression to ExampleSet
A logistic regression model can now be converted into an ExampleSet. The resulting ExampleSet contains the Coefficients, Std. Coefficients and Std. Error as well as the z-Values and the p-Values.
This is what you get when you apply it to a Logistic Regression model which was trained on the Deals data sample.
Results of the Logistic Regression to ExampleSet Operator
You can find a tutorial process attached to this Post.
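For readers who want to see how such columns usually relate to each other, the following Python sketch computes a z-value and a two-sided p-value from a coefficient and its standard error using the standard Wald test. Whether the Operator derives its values in exactly this way is an assumption, and the input numbers are illustrative, not taken from the Deals data.

```python
import math


def wald_statistics(coefficient, std_error):
    """Return (z, p) for a coefficient, assuming a standard Wald test."""
    z = coefficient / std_error
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Illustrative numbers only.
z, p = wald_statistics(coefficient=0.5, std_error=0.25)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests that the coefficient differs from zero by more than its standard error would explain.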
Operator Toolbox - Create ExampleSet
This Operator can be used to create an ExampleSet from a text box. Just insert the data in a CSV-like format into the text box of the Operator. There is no need to create test CSV files anymore, and because the Operator is part of the process, sharing test data is easier than before.
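The idea of keeping small test data inline can be sketched in Python as follows. This only illustrates the concept; the semicolon separator, the header row and the helper name parse_example_text are assumptions, not the Operator's actual settings.

```python
import csv
import io

# Inline test data in a CSV-like format (separator and header are assumptions).
text = """id;name;value
1;alpha;0.5
2;beta;1.5
"""


def parse_example_text(raw, separator=";"):
    """Parse CSV-like text into a list of rows; the first line names the columns."""
    reader = csv.DictReader(io.StringIO(raw), delimiter=separator)
    return [dict(row) for row in reader]


examples = parse_example_text(text)
for example in examples:
    print(example)
```

All values are read as strings here; type guessing is left out of the sketch.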
Operator Toolbox - Set Parameters from ExampleSet
You can now change the Parameters of other Operators in your process by passing the desired changes as an ExampleSet to the input of this new Operator.
Just create an ExampleSet with the Operator name, the Parameter name and the actual value of the Parameter as attributes. During execution of the process, the Set Parameters from ExampleSet Operator changes the Parameters of the corresponding Operators to the provided values.
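The table-driven update described above can be pictured with a short Python sketch. The process is modelled as a plain dictionary, and all Operator names, Parameter names and column names are hypothetical placeholders rather than real RapidMiner identifiers.

```python
# Hypothetical stand-in for a process: operator name -> parameter name -> value.
process = {
    "Model Trainer": {"depth": "10", "criterion": "default"},
    "Data Splitter": {"ratio": "0.7"},
}

# Each row plays the role of one example; the column names are assumptions.
parameter_rows = [
    {"operator": "Model Trainer", "parameter": "depth", "value": "5"},
    {"operator": "Data Splitter", "parameter": "ratio", "value": "0.8"},
]


def set_parameters(process, rows):
    """Apply every (operator, parameter, value) row to the process."""
    for row in rows:
        operator = row["operator"]
        if operator not in process:
            raise KeyError(f"unknown operator: {operator}")
        process[operator][row["parameter"]] = row["value"]
    return process


set_parameters(process, parameter_rows)
print(process)
```

Parameters that are not mentioned in any row keep their previous values.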
Operator Toolbox - Set Macros from ExampleSet
If you want to provide a larger number of macros in a process, you can now use this new Operator to automatically do this. You provide an ExampleSet with the macro names and the macro values as attributes and the Operator sets the macros accordingly.
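As a rough illustration, the Python sketch below turns rows of macro names and values into a lookup table and then expands %{name} references in a string, which is the way macros are referenced in RapidMiner parameters. The column names, the macro names and the helper functions are hypothetical.

```python
import re

# Each row plays the role of one example; column names are assumptions.
macro_rows = [
    {"macro": "input_path", "value": "data/train"},
    {"macro": "folds", "value": "10"},
]


def set_macros(rows):
    """Build a macro table from rows of macro names and values."""
    return {row["macro"]: row["value"] for row in rows}


def expand(text, macros):
    """Replace every %{name} reference; unknown names are left untouched."""
    return re.sub(r"%\{(\w+)\}", lambda m: macros.get(m.group(1), m.group(0)), text)


macros = set_macros(macro_rows)
expanded = expand("Read %{input_path} with %{folds} folds, keep %{unknown}", macros)
print(expanded)
```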
Operator Toolbox - Get Local Interpretation
This new Operator is a meta Operator which generates an approximation of the decision a given (complex) model made for specific Examples. The basic idea is to generate local feature weights (“Interpretations”) for every Example, which are easier to interpret than the model itself. This can help to understand the “reasoning” behind a decision of the complex model.
The following screenshot shows what the results look like for some Examples, where the decision of a Gradient Boosted Tree is interpreted using Weight by Gini Index. The corresponding tutorial process is attached to the Post.
Results of the Get Local Interpretation Operator
So, the next time someone asks you why your model decides in a specific way, you can use this Operator to provide an interpretation of this decision.
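The general idea of local feature weights can be sketched in a few lines of Python. This is not the Operator's algorithm: the black-box model is a made-up function, the neighbourhood is drawn with Gaussian noise, and a plain absolute correlation stands in for a weighting scheme such as Weight by Gini Index. All names and settings are assumptions.

```python
import math
import random


def black_box(x):
    # Hypothetical "complex" model: strong effect of x[0], weaker of x[1], none of x[2].
    return 3.0 * x[0] + 0.5 * x[1] ** 2


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def local_weights(model, example, n_samples=2000, scale=0.1, seed=0):
    """Weight each feature by how strongly it drives the model near one example."""
    rng = random.Random(seed)
    # Sample a neighbourhood around the example and ask the model for its output.
    neighbours = [[v + rng.gauss(0.0, scale) for v in example] for _ in range(n_samples)]
    outputs = [model(n) for n in neighbours]
    return [
        abs(pearson([n[j] for n in neighbours], outputs))
        for j in range(len(example))
    ]


weights = local_weights(black_box, [1.0, 1.0, 1.0])
print(weights)
```

Around this example the first feature should receive the largest weight and the third, which the model ignores, a weight near zero.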
The algorithm is very similar to LIME. Details on LIME can be found here:
Operator Toolbox - Collect and Persist
Another Operator helpful for complex process setups is the new Collect and Persist Operator.
It is used to collect various objects created during the execution of a process.
The Operator creates a new collection (holding the object provided at its input port) the first time it is executed. The collection is then saved in the cache of the process. Subsequent executions of the Operator add more objects to the collection.
Finally, the resulting collection can be retrieved by a simple “Recall from App” Operator.
The Operator can be used to collect arbitrary objects during an Optimization (for example all models and Performance Vectors).
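The create-on-first-use behaviour can be pictured with this Python sketch. A dictionary stands in for the process cache, the function names are hypothetical, and passing the object through unchanged is an assumption about the Operator.

```python
# Hypothetical stand-in for the process cache, keyed by collection name.
cache = {}


def collect_and_persist(name, obj, store=cache):
    """Create the collection on first use, then append to it on later calls."""
    store.setdefault(name, []).append(obj)
    return obj  # assumption: the object is passed through unchanged


def recall(name, store=cache):
    """Fetch the collected objects, similar in spirit to a recall step."""
    return store[name]


# Collect a placeholder "model" in every iteration of a mock optimization loop.
for depth in (2, 4, 6):
    model = {"depth": depth}
    collect_and_persist("models", model)

print(recall("models"))
```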
Operator Toolbox - Filter Tokens using ExampleSet
This Operator is an extension to the Text Processing Extension. It is similar to the Filter Token (Dictionary) Operator, but it receives an ExampleSet as input for the filters.
It can be used inside any Process Documents Operator to filter for strings which you provide using a simple ExampleSet.
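A table-driven token filter might look like the following Python sketch. Whether the Operator keeps or removes the listed tokens, and whether matching ignores case, are assumptions here; both behaviours are exposed through the hypothetical keep_matching flag.

```python
# Rows play the role of the filter ExampleSet; the column name is an assumption.
filter_rows = [{"token": "the"}, {"token": "and"}, {"token": "of"}]


def filter_tokens(tokens, rows, column="token", keep_matching=False):
    """Remove (or keep) every token that appears in the given rows."""
    listed = {row[column].lower() for row in rows}
    return [t for t in tokens if (t.lower() in listed) == keep_matching]


tokens = ["The", "speed", "of", "light", "and", "sound"]
remaining = filter_tokens(tokens, filter_rows)
matched = filter_tokens(tokens, filter_rows, keep_matching=True)
print(remaining)
print(matched)
```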