RapidMiner

‎09-16-2017 07:15 PM

Keras is a high-level neural network API that supports popular deep learning libraries such as TensorFlow, Microsoft Cognitive Toolkit, and Theano.


The RapidMiner Keras extension provides a set of operators that allow easy visual configuration of deep learning network structures and layers. Calculations are pushed into the Python-based backend libraries, so you can leverage the computing power of GPUs and grid environments.

The extension makes use of an existing Keras installation. This article shows how to do a simple deployment of Keras and how to configure the Keras extension to connect to it.

 

Let's review several options:

 

Anaconda on MacOS

 

Warning: As of version 1.2, TensorFlow no longer provides GPU support on macOS. 

  1. Download and install Anaconda from: https://www.continuum.io/downloads#macos
  2. Create a new environment by typing in the command line: conda create -n keras
  3. Activate the created environment by typing in the command line: source activate keras
  4. Install pandas by typing in the command line: conda install pandas
  5. Install scikit-learn by typing in the command line: conda install scikit-learn
  6. Install keras by typing in the command line: conda install -c conda-forge keras
  7. Install graphviz by typing in the command line: conda install -c anaconda graphviz
  8. Install pydotplus by typing in the command line: conda install -c conda-forge pydotplus
  9. In the RapidMiner Studio Keras and Python Scripting preference panels, specify the path to the Python executable of your new conda environment.
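Before pointing RapidMiner at the new environment, you can sanity-check it with a short Python snippet (a generic check added here for convenience, not part of the official instructions; the package names match the install steps above):

```python
import importlib.util

def missing_packages(packages):
    """Return the subset of packages that cannot be imported from this interpreter."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

# Note: scikit-learn installs under the import name "sklearn".
required = ["pandas", "sklearn", "keras", "pydotplus"]
print(missing_packages(required))  # an empty list means the environment is ready
```

Run it with the conda environment's own Python executable so it checks the same interpreter that RapidMiner will call.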

 

You’re good to go!


Anaconda on Windows

 

Warning: Due to issues with package dependencies, it is currently not possible to install graphviz and pydot in a conda environment on Windows; consequently, the model graph cannot be visualised in the results panel.

 

  1. Download and install Anaconda from: https://www.continuum.io/downloads#windows
  2. Create a new environment with Python 3.5.2 by typing in the command line: conda create -n Python35 python=3.5.2
  3. Activate the created environment by typing in the command line: activate Python35
  4. Install pandas by typing in the command line: conda install pandas
  5. Install scikit-learn by typing in the command line: conda install scikit-learn
  6. Install keras by typing in the command line: conda install -c jaikumarm keras=2.0.4
  7. In the RapidMiner Studio Keras and Python Scripting preference panels, specify the path to the Python executable of your new conda environment.

 

You’re good to go!

 

 

Windows

 

  1. Download and install Python 3.5.2 from: https://www.python.org/downloads/release/python-352/ (only Python 3.5.2 works on Windows).
  2. Install numpy with the Intel Math Kernel Library (MKL).
  3. Install pandas from the command line: pip3 install pandas
  4. Install graphviz from the command line: pip3 install graphviz
  5. Install pydot from the command line: pip3 install pydot
  6. Install TensorFlow.
    • If you would like to install TensorFlow with GPU support, please see the instructions here: https://www.tensorflow.org/install/install_windows
    • If you would like to install TensorFlow with CPU support only, run from the command line: pip3 install --upgrade tensorflow
  7. Install Keras from the command line: pip3 install keras

 

You’re good to go!

 

 

 

RapidMiner extension

 

  1. Install the Keras extension from the RapidMiner Marketplace.
  2. Install the RapidMiner Python Scripting extension from the Marketplace, if not already installed.
  3. Restart RapidMiner Studio.
  4. In Studio, go to Settings (menu) > Preferences and navigate to the “Python Scripting” tab on the left. Provide the path to the Python executable and click Test to ensure it succeeds.
  5. In Studio, go to Settings (menu) > Preferences and navigate to the “Keras” tab on the left. Provide the path to the Python executable and click Test to ensure it succeeds.

 

Try out a few sample processes from the “Keras Sample” in the repository view.


 

 

Comments
Contributor I hermawan_eriadi

Thanks for the answer @pschlunder

I've found a problem. When I run the SP_500_Regression sample, "Apply Keras Model" cannot score all of the testing examples: it always fails to predict the last 11 examples. I tried another process as well, with the same result. What am I missing?

 

What loss value represents a good result? In my model, the best loss I can get is 0.2, with binary_accuracy or categorical_accuracy below 50%, even when I raise the epochs to about 5000. Does that mean my model is not deep enough?
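For a rough sense of scale (a back-of-the-envelope sketch, not something from the extension's documentation): binary cross-entropy for one example is -ln(p), where p is the probability assigned to the true class, so a 50/50 random guess scores about 0.69, and a loss near 0.2 would normally come with accuracy well above 50%:

```python
import math

# Binary cross-entropy contributed by one example whose true class
# is predicted with probability p: loss = -ln(p).
def bce(p):
    return -math.log(p)

print(round(bce(0.5), 3))        # 0.693: a coin-flip prediction
print(round(math.exp(-0.2), 3))  # 0.819: average true-class probability implied by loss 0.2
```

A loss of 0.2 alongside sub-50% accuracy therefore suggests double-checking whether the loss and the accuracy are being computed on the same data split and label encoding, rather than depth alone.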

 

Thanks..

Learner III 56005393_t30l41

 

Error: implementing stacked LSTMs in RapidMiner.

I got this error message:

ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

 

 

 

Learner III j_vreugdenhil

Does somebody know an easy method to convert 2D data to 3D in RapidMiner for feeding into an RNN or LSTM network? Online I've found methods using Python, but that's not what I was looking for.

Contributor I hermawan_eriadi

Dear @j_vreugdenhil

Try using the "Add Core Layer" operator with Layer Type: Reshape. You can fill "Target Shape" with the dimensions you want, e.g. (4,77) for 3 dimensions: (examples, 4 timesteps, 77 attributes).
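The shape arithmetic behind this advice can be checked with plain numpy (used here only as an illustration; the extension does this via the Keras Reshape layer). With the (4,77) example, the example axis stays implicit and 308 flat attributes split into 4 timesteps of 77 attributes:

```python
import numpy as np

examples = 10
flat = np.arange(examples * 4 * 77).reshape(examples, 308)  # 2D: (examples, attributes)

# Reshape with target shape (4, 77): the example axis is kept, 308 = 4 * 77 splits.
cube = flat.reshape(examples, 4, 77)                        # 3D: (examples, timesteps, attributes)
print(cube.shape)  # (10, 4, 77)

# Values are untouched; only the layout changes.
print(np.array_equal(cube.reshape(examples, 308), flat))  # True
```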

Learner III j_vreugdenhil

Thanks! I'll try this. Will Reshape automatically convert my 2D dataset (x examples, 3 features) to a 3D dataset (x examples, x timesteps each, 3 features)?

Contributor I hermawan_eriadi

Yes, try (1,3) in the case of only one timestep.

Newbie pxkst970

 

Dear @j_vreugdenhil

I'm having the same issue you had. I already tried "Add Core Layer" with the "Reshape" layer type, but I still get the same errors. Did you find out how to make it work?

Contributor I dass

Hi everyone, 

Is there any way to implement a stacked LSTM in the Keras model? It seems I can't find the 'return_sequences' parameter in the recurrent layer. Thank you.

 

RM Certified Analyst

@pxkst970 Until now, the only way I have found to make the LSTM work with the RNN operator in RM is to fix the input_shape parameter of the Keras Model operator to (n, 1), where n = number of examples in the RM dataframe.

 

Then, inside the Keras Model operator, reshape from 2D to 3D with the same shape in the target_shape of the Core Layer. I suppose there are many other configurations that would make it work, but for now this is the only one I've found.
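The 2D-to-3D step described here can be sketched in plain numpy (an illustration of the shapes only, not the extension's actual code). The Windowing operator turns a univariate series into rows of window-size lagged values, and the reshape then adds the trailing feature axis an LSTM expects:

```python
import numpy as np

def window(series, size):
    """Mimic the Windowing operator: each row holds `size` consecutive values."""
    return np.stack([series[i:i + size] for i in range(len(series) - size + 1)])

series = np.arange(20, dtype=float)
flat = window(series, 5)        # 2D: (examples, 5) - what RapidMiner passes in
cube = flat.reshape(-1, 5, 1)   # 3D: (examples, 5 timesteps, 1 feature) - LSTM input
print(flat.shape, cube.shape)   # (16, 5) (16, 5, 1)
```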

 

@dass Currently I'm also looking for a stacked LSTM, because there is no return_sequences parameter in the RNN operator.
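To see why stacking recurrent layers needs return_sequences, here is a toy numpy unrolling of a SimpleRNN-style layer (an illustrative sketch with random weights, not the actual Keras implementation). With return_sequences=True the layer emits one output per timestep (ndim=3), which is what the next recurrent layer expects; otherwise only the final state (ndim=2) comes out, matching the "expected ndim=3, found ndim=2" error reported above:

```python
import numpy as np

def simple_rnn(x, units, return_sequences=False, seed=0):
    """Toy forward pass of a SimpleRNN-like layer. x has shape (batch, timesteps, features)."""
    rng = np.random.default_rng(seed)
    batch, timesteps, features = x.shape
    w_in = rng.normal(size=(features, units))   # input-to-hidden weights (random, untrained)
    w_rec = rng.normal(size=(units, units))     # hidden-to-hidden weights
    h = np.zeros((batch, units))
    states = []
    for t in range(timesteps):
        h = np.tanh(x[:, t, :] @ w_in + h @ w_rec)
        states.append(h)
    return np.stack(states, axis=1) if return_sequences else h

x = np.ones((2, 5, 3))
print(simple_rnn(x, 4).shape)                         # (2, 4) - only the last state, ndim=2
print(simple_rnn(x, 4, return_sequences=True).shape)  # (2, 5, 4) - one state per step, ndim=3
```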

Contributor I Montse

Hi @israel_jimenez,

Could you send the XML process? I can't get it to work with the specifications you have said.

 

Regards,

Montse

RM Certified Analyst

@Montse my bad, I've just realized that I said "n = number of examples" when it should be "number of attributes".

 

It's a very naive implementation that makes little practical sense; it's just to review the architecture of the Keras LSTM implementation.

 

Regards.

<?xml version="1.0" encoding="UTF-8"?><process version="8.1.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.1.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_data" compatibility="8.1.003" expanded="true" height="68" name="Generate Data" width="90" x="45" y="34">
        <parameter key="number_examples" value="10000"/>
        <parameter key="number_of_attributes" value="1"/>
      </operator>
      <operator activated="true" class="subprocess" compatibility="8.1.003" expanded="true" height="82" name="Subprocess" width="90" x="179" y="34">
        <process expanded="true">
          <operator activated="true" class="select_attributes" compatibility="8.1.003" expanded="true" height="82" name="Select Attributes" width="90" x="45" y="34">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="179" y="34">
            <parameter key="window_size" value="5"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="8.1.003" expanded="true" height="82" name="Set Role" width="90" x="313" y="34">
            <parameter key="attribute_name" value="att1-4"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <connect from_port="in 1" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Windowing" to_port="example set input"/>
          <connect from_op="Windowing" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="source_in 2" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="split_data" compatibility="8.1.003" expanded="true" height="103" name="Split Data" width="90" x="313" y="34">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.7"/>
          <parameter key="ratio" value="0.3"/>
        </enumeration>
      </operator>
      <operator activated="true" class="keras:sequential" compatibility="1.0.003" expanded="true" height="166" name="Keras Model" width="90" x="447" y="34">
        <parameter key="input shape" value="(5, 1)"/>
        <parameter key="optimizer" value="Adam"/>
        <parameter key="momentum" value="0.01"/>
        <enumeration key="metric"/>
        <parameter key="epochs" value="50"/>
        <enumeration key="callbacks"/>
        <process expanded="true">
          <operator activated="true" class="keras:core_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Core Layer" width="90" x="112" y="34">
            <parameter key="layer_type" value="Reshape"/>
            <parameter key="target_shape" value="(5, 1)"/>
            <parameter key="dims" value="1.1"/>
          </operator>
          <operator activated="true" class="keras:recurrent_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Recurrent Layer" width="90" x="246" y="34">
            <parameter key="layer_type" value="LSTM"/>
            <parameter key="no_units" value="32"/>
          </operator>
          <operator activated="true" class="keras:core_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Core Layer (2)" width="90" x="380" y="34">
            <parameter key="activation_function" value="'linear'"/>
            <parameter key="dims" value="1.1"/>
          </operator>
          <connect from_op="Add Core Layer" from_port="layers 1" to_op="Add Recurrent Layer" to_port="layers"/>
          <connect from_op="Add Recurrent Layer" from_port="layers 1" to_op="Add Core Layer (2)" to_port="layers"/>
          <connect from_op="Add Core Layer (2)" from_port="layers 1" to_port="layers 1"/>
          <portSpacing port="sink_layers 1" spacing="0"/>
          <portSpacing port="sink_layers 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="keras:apply" compatibility="1.0.003" expanded="true" height="82" name="Apply Keras Model" width="90" x="581" y="34"/>
      <connect from_op="Generate Data" from_port="output" to_op="Subprocess" to_port="in 1"/>
      <connect from_op="Subprocess" from_port="out 1" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Keras Model" to_port="training set"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Apply Keras Model" to_port="unlabelled data"/>
      <connect from_op="Keras Model" from_port="model" to_op="Apply Keras Model" to_port="model"/>
      <connect from_op="Apply Keras Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
RM Certified Analyst

 

 

I really appreciate any help you can provide

@dgrzech, I am back at RM Keras. I have been playing with a standard MNIST example and comparing the same model's performance against versions running on R/RStudio and Python/Anaconda. I have stumbled on an issue while plotting the results in TensorBoard.

The common metric used in measuring the classifier is "accuracy" or "acc", which Keras then translates into something more appropriate depending on the output shape; for MNIST it becomes "categorical_accuracy". The RM Keras plugin tries to be smarter than humans working with Python and R, and it eliminates "accuracy" and "acc" as options for such models. This causes problems with TensorBoard: the Python- and R-generated logs show up in "acc" and "val_acc" panels, while the RM logs are placed in separate "categorical_accuracy" and "val_categorical_accuracy" panels. We need to bring "accuracy" back to the Keras plugin's metric options! By the way, the issue can be "fixed" by finding the offending option in the XML and changing it there; the RM Keras logs will then rejoin the rest of the world.

 

Jacob

 

P.S. Interestingly, when running MNIST over 30 epochs and comparing with R (orange and dark blue) and Python (brown and light blue), RM (purple) consistently produces the best validation loss, but its validation accuracy places it in the middle. Python beats all; R comes last.

(Image: Tensorboard - Five Runs)

Learner II yongsr

Can this run on Linux?

 

I tried to install numpy with MKL, but I think there is no such package on Linux.

@yongsr, look at my previous post on Keras installation on Linux. You do not need MKL, as most likely your Linux (I use Ubuntu) will have all the necessary math libraries after you successfully install the NVIDIA Toolkit. The key to Keras installation (on all platforms) is TensorFlow (if this is your backend). Work through all the requirements, starting with TensorFlow and going down the stack: Anaconda > cuDNN > NVIDIA Toolkit > GPU drivers. Do not install any software newer than what TensorFlow recommends. After this, work up the stack with Keras and then RapidMiner. Keras requires a few extra libraries, and the bottleneck is always graphviz (make sure you apt-get install it, and also conda install both graphviz and python-graphviz) and pydot (which you may need to pip install rather than conda install). After this you are set to go.

Learner II yongsr

@jacobcybulski, I tried it on Ubuntu 16.04, it works perfectly.

Learner III varunm1

Hello,  

@pschlunder @jpuente @jacobcybulski @sgenzer

 

I am trying to apply a recurrent network with a SimpleRNN and am encountering an issue with input dimensions. The input is a dataset with 408 attributes, 1 label, and 1028 samples. The error states that simple_rnn_1 expected its input to have 3 dimensions but got an array of shape (1029, 408). I set the input shape of the Keras Model to (1,408), where 1 is the time step and 408 is the number of attributes in my dataset. I also used a core layer in the Keras Model with layer type Reshape and target shape (1,408); the batch size is 10. I still cannot understand why I am encountering this issue. Your help is much appreciated.

 

Please find XML code below.    

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve 2 Clip ORG CB &amp; T2" width="90" x="112" y="85">
        <parameter key="repository_entry" value="//Local Repository/data/2 Clip ORG CB &amp; T2"/>
      </operator>
      <operator activated="true" class="split_data" compatibility="8.2.000" expanded="true" height="103" name="Split Data" width="90" x="179" y="238">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.8"/>
          <parameter key="ratio" value="0.2"/>
        </enumeration>
      </operator>
      <operator activated="true" class="keras:sequential" compatibility="1.0.003" expanded="true" height="166" name="Keras Model" width="90" x="380" y="85">
        <parameter key="input shape" value="(1,408)"/>
        <parameter key="optimizer" value="Adam"/>
        <enumeration key="metric"/>
        <enumeration key="callbacks"/>
        <process expanded="true">
          <operator activated="true" class="keras:core_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Core Layer" width="90" x="179" y="85">
            <parameter key="layer_type" value="Reshape"/>
            <parameter key="target_shape" value="(1,408)"/>
            <parameter key="dims" value="1.1"/>
          </operator>
          <operator activated="true" class="keras:recurrent_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Recurrent Layer" width="90" x="380" y="85">
            <parameter key="no_units" value="200"/>
            <parameter key="activation" value="softmax"/>
          </operator>
          <connect from_op="Add Core Layer" from_port="layers 1" to_op="Add Recurrent Layer" to_port="layers"/>
          <connect from_op="Add Recurrent Layer" from_port="layers 1" to_port="layers 1"/>
          <portSpacing port="sink_layers 1" spacing="0"/>
          <portSpacing port="sink_layers 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="keras:apply" compatibility="1.0.003" expanded="true" height="82" name="Apply Keras Model" width="90" x="581" y="136"/>
      <connect from_op="Retrieve 2 Clip ORG CB &amp; T2" from_port="output" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Keras Model" to_port="training set"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Apply Keras Model" to_port="unlabelled data"/>
      <connect from_op="Keras Model" from_port="model" to_op="Apply Keras Model" to_port="model"/>
      <connect from_op="Apply Keras Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

Contributor I hermawan_eriadi
Dear @varunm1, try using input shape (1,1029,408).
Learner III varunm1

Hello @hermawan_eriadi,

 

Thanks for your response. I changed it as specified. Now it throws an error stating "expected ndim=3 but found ndim=4". Please find the XML below, along with the model images showing the error.

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve 2 Clip ORG CB &amp; T2" width="90" x="45" y="85">
        <parameter key="repository_entry" value="//Local Repository/data/2 Clip ORG CB &amp; T2"/>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="179" y="85">
        <parameter key="window_size" value="1"/>
      </operator>
      <operator activated="true" class="split_data" compatibility="8.2.000" expanded="true" height="103" name="Split Data" width="90" x="313" y="136">
        <enumeration key="partitions">
          <parameter key="ratio" value="0.8"/>
          <parameter key="ratio" value="0.2"/>
        </enumeration>
      </operator>
      <operator activated="true" class="keras:sequential" compatibility="1.0.003" expanded="true" height="166" name="Keras Model" width="90" x="447" y="34">
        <parameter key="input shape" value="(1,1028,408)"/>
        <parameter key="loss" value="binary_crossentropy"/>
        <parameter key="optimizer" value="Adam"/>
        <enumeration key="metric"/>
        <enumeration key="callbacks"/>
        <process expanded="true">
          <operator activated="true" class="keras:recurrent_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Recurrent Layer" width="90" x="447" y="85">
            <parameter key="no_units" value="200"/>
            <parameter key="recurrent_initializer" value="Orthogonal(gain=1.0, seed=None)"/>
            <parameter key="implementation" value="1"/>
          </operator>
          <operator activated="true" class="keras:core_layer" compatibility="1.0.003" expanded="true" height="82" name="Add Core Layer" width="90" x="782" y="85">
            <parameter key="no_units" value="2"/>
            <parameter key="activation_function" value="'softmax'"/>
            <parameter key="dims" value="1.1"/>
          </operator>
          <connect from_op="Add Recurrent Layer" from_port="layers 1" to_op="Add Core Layer" to_port="layers"/>
          <connect from_op="Add Core Layer" from_port="layers 1" to_port="layers 1"/>
          <portSpacing port="sink_layers 1" spacing="0"/>
          <portSpacing port="sink_layers 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="keras:apply" compatibility="1.0.003" expanded="true" height="82" name="Apply Keras Model" width="90" x="782" y="136"/>
      <connect from_op="Retrieve 2 Clip ORG CB &amp; T2" from_port="output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_op="Split Data" to_port="example set"/>
      <connect from_op="Split Data" from_port="partition 1" to_op="Keras Model" to_port="training set"/>
      <connect from_op="Split Data" from_port="partition 2" to_op="Apply Keras Model" to_port="unlabelled data"/>
      <connect from_op="Keras Model" from_port="model" to_op="Apply Keras Model" to_port="model"/>
      <connect from_op="Apply Keras Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

(Images: Rapid_error_1.JPG, Rapid_error_2.JPG)

 

Contributor I hermawan_eriadi

Using input shape and target shape (1,408) should work. I have tried it with my own data and it works, even without the Reshape layer.

Try changing the window size in the Windowing operator to 2 or more, and change the input shape to (window_size,408). If that works, go back to window size = 1 and change the input shape again. That should do the trick...

 

Make sure you still have the label attribute after the Windowing operator.

Learner III varunm1

@hermawan_eriadi

Thanks again for your response. I tried window size 2 and a Keras input shape of (2,408). Now it says "list index out of range". I am using two layers in the Keras model: a SimpleRNN with 200 units and ReLU activation, connected to a Dense core layer with 2 units (as the data has 2 class labels) and softmax activation. Do you think there is an error in this? I can get output in Python (Anaconda) using Keras with the same configuration.

 

Thanks

Contributor I hermawan_eriadi

I think the "list index out of range" error has nothing to do with your choice of units, activation, or how the layers are connected. If you look at the "Log" panel, you will see whether the error comes from the Keras Model or from the "Apply Keras Model" operator.

 

I am not sure, but maybe it's because you have missing attributes in your data? You didn't share your data, so I can't check.

Contributor I chinrenjie

Hi, 

I would like to ask a question regarding recurrent neural networks.
I have 9000 examples (7200 for training and 1800 for testing). The dataset has 4 inputs and 1 output.
I tried to develop an RNN model to predict the output.
Below are the details of the Keras model:
Learning rate: 0.001; loss: mean_squared_error; optimizer: Adam; epochs: 256; batch size: 32
Core layer [Reshape to (1,4)] ---> Recurrent layer [LSTM, 1 unit, tanh activation, tanh recurrent activation] ---> Core layer [Dropout rate 0.25] ---> Core layer [Dense, 250 units, relu activation] ---> Core layer [Dense, 1 unit, linear activation]

The predicted values I obtain are far from the actual values.
I have tried different activation functions in the LSTM, but the results do not improve. Can anyone give a suggestion regarding this problem?

 
Your help is highly appreciated. Thank you.

 

Contributor I hermawan_eriadi

@chinrenjie

IMHO, per the "no free lunch" theorem, you have to try many alternatives to find the best-fitting model. For a TensorFlow (Keras) model there are even more alternatives, including layers (number and type) and parameters.

 

But from what you said above, I guess you have only 4 attributes? [Reshape to (1,4)]. I think that is too few to get more varied (and better) results. You would do better to get more attributes instead of adding more layers.

 

Before you use a Dropout layer, first find the best accuracy (or minimum error) by adding other layer types; then you can use Dropout to avoid overfitting.

 

Have you tried using validation in your model? It is useful for tuning your model before applying it to the test data.

 

Try also the Optimize Parameters (Grid) operator, to make trying combinations easier.
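The Optimize Parameters (Grid) operator works essentially like this small sketch (with a made-up stand-in scoring function, purely to show the mechanics): evaluate every combination in a parameter grid and keep the best one:

```python
from itertools import product

# Hypothetical stand-in for training and validating a model with given settings;
# in RapidMiner this would be the inner process evaluated per combination.
def score(units, dropout):
    return -abs(units - 200) / 100 - abs(dropout - 0.25)

grid = {"units": [50, 100, 200], "dropout": [0.0, 0.25, 0.5]}
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
best = max(combos, key=lambda c: score(**c))
print(best)  # {'units': 200, 'dropout': 0.25}
```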

 

Learner III varunm1

Hi,

 

Is there any extension for supervised Hidden Markov Models for classification in RapidMiner?

 

Thanks