Interpreting Deep Learning Models

AizatAlam_129 Member Posts: 14 Contributor II
Newbie here, can anyone share with me how to interpret hidden layers for Deep Learning models?

Best Answers

  • jacobcybulski Member, University Professor Posts: 391 Unicorn
    Solution Accepted
    By the way, the following article uses TensorFlow and Python examples (rather than RapidMiner), but it shows how to tell whether your deep learning model has gradient-related issues:
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted
    IMHO, for most people "understanding" what is going on inside a NN is nearly impossible, especially if you are trying to parse out differences between specific layers. NNs by design replicate complex variable interactions in high-dimensional space. You are better off using post-modeling tools to understand variable importance and sensitivity in the model as a whole. RapidMiner has several such operators available, including Explain Predictions and Model Simulator.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
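
To make "variable importance for the model as a whole" concrete, here is a minimal sketch of permutation importance in plain Python. It is a generic, model-agnostic technique and is not a description of how RapidMiner's Explain Predictions or Model Simulator work internally. The function name `permutation_importance`, the toy `black_box` model, and the data are all hypothetical and chosen only for illustration.

```python
import random

def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Average increase in mean squared error when one input column is shuffled.

    predict: any function mapping a row (list of floats) to a number.
    A larger increase means the model leans more heavily on that column.
    """
    rng = random.Random(seed)

    def mse(data):
        return sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(targets)

    baseline = mse(rows)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with only this one column permuted.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            increases.append(mse(permuted) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy stand-in for a trained network: it uses column 0 and ignores column 1.
def black_box(row):
    return 3.0 * row[0] + 0.0 * row[1]

data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(50)]
targets = [black_box(r) for r in rows]

print(permutation_importance(black_box, rows, targets))
```

In this toy case the ignored column gets an importance of zero and the used column a positive one, which is the kind of whole-model summary described above, obtained without looking inside any layer.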


  • jacobcybulski Member, University Professor Posts: 391 Unicorn
    edited January 2021
    Analysing hidden layers is considered the "black art" of deep learning. In convolutional neural nets, hidden layers can often be interpreted directly, as they tend to represent visual features of the input at increasing levels of abstraction. There are some research tools that can help you debug such hidden layers. In recurrent deep models, the hidden layers are stacks of sequences and are very hard (if not impossible) to analyse.

    It is in multilayer perceptrons (neural nets of fully connected layers) that insight into the hidden layers can help you debug some problems with deep neural net learning. This is useful when the neural net stops learning due to vanishing or exploding gradients (with hidden layer weights shrinking towards zero or growing towards infinity). By looking at the distribution of weights and biases over time, you may see that the weights become very small or very large at some point, and that the training performance plummets at the same time. In such cases, you can identify which layer is prone to this and introduce either weight regularisation for that particular layer or a batch normalisation layer before it (other options are possible, e.g. changing the weight initialisation, learning rate, or activation function).
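
As a rough illustration of watching weights over time, the sketch below scans per-epoch weight snapshots and reports the first epoch at which a layer's weights collapse towards zero or blow up. It assumes you have already exported the weights of each layer after each epoch as plain lists of floats; the function name `flag_unstable_layers`, the layer names, and the `tiny` and `huge` thresholds are hypothetical and would need tuning for a real model.

```python
import math
import statistics

def flag_unstable_layers(history, tiny=1e-6, huge=1e3):
    """Find layers whose weights collapse or explode during training.

    history: dict mapping layer name -> list of weight lists, one per epoch.
    Returns a dict mapping layer name -> (epoch index, "collapsed" or "exploded")
    for the first epoch at which that layer looks unhealthy.
    """
    flagged = {}
    for layer, snapshots in history.items():
        for epoch, weights in enumerate(snapshots):
            # NaN or infinite weights mean training has already diverged.
            if any(math.isnan(w) or math.isinf(w) for w in weights):
                flagged[layer] = (epoch, "exploded")
                break
            mean_abs = statistics.fmean(abs(w) for w in weights)
            if mean_abs < tiny:
                flagged[layer] = (epoch, "collapsed")
                break
            if mean_abs > huge:
                flagged[layer] = (epoch, "exploded")
                break
    return flagged

# Made-up snapshots for three layers over three epochs.
history = {
    "hidden_1": [[0.2, -0.1, 0.3], [0.25, -0.15, 0.28], [0.3, -0.2, 0.27]],
    "hidden_2": [[0.1, -0.2, 0.05], [1e-4, -2e-4, 1e-5], [1e-8, -1e-9, 0.0]],
    "hidden_3": [[0.5, -0.4, 0.3], [50.0, -80.0, 20.0], [5e4, -9e4, float("inf")]],
}

print(flag_unstable_layers(history))
```

A layer flagged this way is a candidate for the remedies mentioned above; comparing the flagged epoch with the point where training performance drops helps confirm that the two are related.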

    My advice is: before you dive into analysing hidden layers, first learn how to analyse how the model learns over time, on both training and validation data!
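
One simple way to read training and validation curves together can be sketched as follows. The helper `diagnose_learning_curves`, its `patience` parameter, and the three verdict strings are hypothetical; the loss values are invented, and the rules are a rough heuristic rather than a definitive test.

```python
def diagnose_learning_curves(train_loss, val_loss, patience=3):
    """Return (epoch with lowest validation loss, short verdict).

    patience: how many epochs must pass after the best validation epoch,
    with training loss still falling, before we call it overfitting.
    """
    if not train_loss or len(train_loss) != len(val_loss):
        raise ValueError("need two non-empty loss lists of equal length")

    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    epochs_since_best = len(val_loss) - 1 - best_epoch
    train_still_improving = train_loss[-1] < train_loss[best_epoch]

    if epochs_since_best >= patience and train_still_improving:
        verdict = "overfitting"        # training improves, validation does not
    elif train_loss[-1] >= train_loss[0]:
        verdict = "not learning"       # training loss never went down
    else:
        verdict = "no obvious problem"
    return best_epoch, verdict

# Invented per-epoch losses: validation loss bottoms out at epoch 2.
train = [1.0, 0.7, 0.5, 0.4, 0.3, 0.2, 0.15]
val = [1.1, 0.8, 0.65, 0.7, 0.75, 0.8, 0.9]

print(diagnose_learning_curves(train, val))
```

If the verdict is "not learning", that is the point at which looking at the weights of individual hidden layers becomes worthwhile.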

    In RapidMiner, if you use the Deep Learning extension, you can look at what is going on during network training, epoch by epoch and layer by layer. Place a breakpoint after the Deep Learning operator and run the process. Make sure the log panel is visible (View > Panel > Log). You will then see a message of the kind: Training UI available at: http://localhost:39725 (that URL is unique for each run). Open the URL to access the DL4J Training UI window. It is extremely useful during very long training sessions: you can view the training performance and check the model architecture and its evolving parameters.

    If you do not place that breakpoint and let the process finish, the DL4J Training UI "server" process unfortunately also ends, and the information is no longer accessible. Also beware that if you keep changing and playing with your model and never let the process finish, your browser caches the same URL and you will not be able to see the changes. In that case, let the process run to the finish at some point.

    Enjoy watching your deep model learn -- Jacob