Linear Regression example from textbook by Kotu and Deshpande 2015

jacy Member Posts: 2 Newbie
edited August 2020 in Help

I'm new to RapidMiner. I am trying to follow the linear regression steps on pp. 172-179 of Chapter 5 in the textbook "Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner" (2015). The textbook is quite old, so is this possibly an outdated way of doing it?

Unfortunately, I am too new to the community to post an image of the process, but if you search for "Figure 5.10. Setting up a process to do the comparison between the unseen data and the model predicted values", the screenshot is at the bottom right of the ScienceDirect page returned.

The step I cannot get to work is the last Generate Attributes step. An attribute is created to calculate the difference between the predicted value and the actual value. The histogram of this difference is then viewed to check the distribution. 
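For readers who want to see the calculation outside RapidMiner, here is a minimal Python sketch of the same idea: subtract the actual value from the predicted value, then bin the differences to inspect their distribution. The numbers and the `histogram` helper are made up for illustration and are not taken from the textbook's data set.

```python
from collections import Counter

# Hypothetical actual and predicted values (illustrative only).
actual = [24.0, 21.5, 34.5, 33.0, 36.0]
predicted = [25.5, 20.0, 33.0, 35.5, 35.75]

# Residual as in the textbook formula: predicted minus actual.
residuals = [p - a for p, a in zip(predicted, actual)]


def histogram(values, bin_width=1.0):
    """Count values per bin; each bin is keyed by its lower edge."""
    counts = Counter()
    for v in values:
        lower_edge = (v // bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


hist = histogram(residuals)
for lower_edge, count in hist.items():
    # Print a simple text histogram, one row per bin.
    print(f"[{lower_edge:5.1f}, {lower_edge + 1.0:5.1f}) {'#' * count}")
```

A roughly symmetric histogram centered near zero would suggest the model is not systematically over- or under-predicting.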

The textbook says to enter the formula: (predictedMEDV-MEDV)
Both of these are listed as "special attributes" when I view the function expressions input list 

The error I receive when running the process is: "The attribute MEDV is unknown" 
When I view the example set, the field MEDV does not display. 
But when I view the input port on the Generate Attributes operator, I can see the MEDV field there with a "prediction" label. 

Any tips? 

ETA: I have ended up generating an ID for the original data set and joining back to it at the end, so I can compare the original attribute with the predicted attribute.
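The ID-and-join workaround can be sketched in plain Python as follows. This only illustrates the logic and is not a RapidMiner process; the `id` field, the row values, and the variable names are hypothetical.

```python
# Hypothetical original rows, each tagged with a generated ID.
original = [
    {"id": i, "MEDV": medv}
    for i, medv in enumerate([24.0, 21.5, 34.5], start=1)
]

# Hypothetical scored rows in which the actual MEDV is no longer visible.
scored = [
    {"id": 1, "predictedMEDV": 25.5},
    {"id": 3, "predictedMEDV": 33.0},
    {"id": 2, "predictedMEDV": 20.0},
]

# Look up the actual value by ID, then join it back onto each scored row.
medv_by_id = {row["id"]: row["MEDV"] for row in original}
joined = [
    {
        **row,
        "MEDV": medv_by_id[row["id"]],
        "residual": row["predictedMEDV"] - medv_by_id[row["id"]],
    }
    for row in scored
    if row["id"] in medv_by_id  # inner join: keep only IDs present in both
]

for row in joined:
    print(row)
```

Because the match is made on the ID rather than on row position, the scored rows can come back in any order.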



  • jacobcybulski Member, University Professor Posts: 391 Unicorn
    I suggest that, rather than typing the formula in Generate Attributes, you click on the "calculator" icon, which lets you select the attributes from a list to construct the formula. That way you avoid typing mistakes. Jacob
  • jacy Member Posts: 2 Newbie
    Thanks, Jacob. I tried this too, but it just didn't work, even though the attributes were available to click on. In a different process, I tried to generate the residual again and it worked fine, so I'm really not sure why this particular process won't work. It's OK, as I was just doing it as an example to try to follow the book. Thanks for your reply.
  • jacobcybulskijacobcybulski Member, University Professor Posts: 391 Unicorn
    @jacy RapidMiner has changed a lot since that edition of the book by Kotu and Deshpande. For example, RapidMiner no longer "gets confused" when you use the prediction attribute in the formula. You may wish to look at the second and much improved edition of the book, now called "Data Science: Concepts and Practice", which I highly recommend. I have used both editions in my teaching and have been very happy with the examples in both. So good luck! Jacob