Conducting a "Key Driver" analysis in RapidMiner

RSinclair Member Posts: 4 Learner I
edited December 2018 in Help
I am looking for instructions or a tutorial on how to conduct a "Key Driver" analysis using RapidMiner. RM team members at Wisdom 2018 told me it could easily be done, but I cannot find any instructions on the website that explain in detail how this type of analysis is done in RM. Help is much appreciated.



  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,073 Unicorn
    Hi @RSinclair,

    I just looked up on the web what a "Key Driver" analysis is.
    If I understood correctly, it calculates the correlation of the different attributes with your target variable.
    So I suggest using RapidMiner's Correlation Matrix operator.
    If I misunderstood, please correct me and explain in more detail what your data looks like and what you want to obtain.



  • RSinclair Member Posts: 4 Learner I
    Hi Lionel,

    I apologize for the late acknowledgment of your response - I must have missed the email informing me that someone had replied to my post. To answer your question, it involves a bit more than a correlation matrix.

    Multiple linear regression is the most common technique for computing a Key Driver Analysis (KDA). It is one of the “workhorses” of multivariate analysis. It works by examining the correlations among the independent variables and the outcome variable to generate the best linear combination of predictors for the outcome. It provides a model “fit” using R-squared, which tells you how well the independent variables predict the dependent variable. For example, an R-squared value of .50 means the independent variables explain 50% of the variance in the dependent variable.
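
To make the regression-based approach described above concrete, here is a minimal Python sketch (outside RapidMiner, using NumPy). It is only an illustration under stated assumptions: the function name `key_driver_analysis`, the attribute names, and the synthetic data are all made up for this example. It standardizes the variables, fits an ordinary least squares model, ranks the attributes by the size of their standardized coefficients, and reports R-squared.

```python
import numpy as np

def key_driver_analysis(X, y, names):
    """Rank attributes by standardized regression coefficient (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize so that coefficients are comparable across attributes.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    # Ordinary least squares; no intercept is needed because both sides are centered.
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residuals = yz - Xz @ beta
    # Share of the variance in the outcome explained by the model.
    r_squared = 1.0 - (residuals @ residuals) / (yz @ yz)
    ranking = sorted(zip(names, beta), key=lambda p: abs(p[1]), reverse=True)
    return ranking, r_squared

# Synthetic example data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 500
price = rng.normal(size=n)
service = rng.normal(size=n)
speed = rng.normal(size=n)
satisfaction = 0.2 * price + 0.7 * service + rng.normal(scale=0.5, size=n)

X = np.column_stack([price, service, speed])
ranking, r2 = key_driver_analysis(X, satisfaction, ["price", "service", "speed"])
for name, coef in ranking:
    print(f"{name}: {coef:+.2f}")
print(f"R-squared: {r2:.2f}")
```

In this made-up data set, "service" was constructed to have the strongest influence, so it should come out on top of the ranking. Note that standardized coefficients are only one of several ways to rank drivers, and they become hard to interpret when the attributes are strongly correlated with each other.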

    Thanks for the response,

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,749 RM Founder
    This is VERY close to what the operator "Explain Predictions" is doing. To get those results, you could train a linear model first and then use Explain Predictions for this model, which uses local correlations to show the contributions of the independent variables to the prediction of the dependent variable. The process below shows a simple example of this. If you go with a more complex or non-linear model you will actually see some more interesting results, but the concept is the same...
    Hope this helps,

    <?xml version="1.0" encoding="UTF-8"?><process version="9.2.000">
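
For readers who want a feel for the "local correlations" idea outside RapidMiner, the following Python sketch may help. It is not the implementation of the Explain Predictions operator. The function `local_contributions`, the Gaussian sampling scheme, and the toy `linear_model` are assumptions made for this illustration. The sketch samples points around one example, asks the model for predictions, and correlates each attribute with those predictions.

```python
import numpy as np

def local_contributions(predict, point, scale=1.0, n_samples=2000, seed=0):
    """Correlate each attribute with the model output near `point` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    point = np.asarray(point, dtype=float)
    # Sample a neighborhood around the example to be explained (assumed Gaussian).
    neighbors = point + rng.normal(scale=scale, size=(n_samples, point.size))
    preds = predict(neighbors)
    # One correlation per attribute: how strongly it moves the prediction locally.
    return np.array([
        np.corrcoef(neighbors[:, j], preds)[0, 1]
        for j in range(point.size)
    ])

def linear_model(X):
    # Toy stand-in for a trained linear model (coefficients are made up).
    return 0.2 * X[:, 0] + 0.7 * X[:, 1] + 0.0 * X[:, 2]

contributions = local_contributions(linear_model, point=[1.0, 2.0, 3.0])
for name, c in zip(["price", "service", "speed"], contributions):
    print(f"{name}: {c:+.2f}")
```

For a linear model the result is the same wherever the example sits, which is why the attribute ranking matches a global regression. With a non-linear model, the correlations would change from one example to the next.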
