Linear Regression Prediction - wrong calculation?

rso Member Posts: 5 Contributor II
edited November 2018 in Help
Hi, I used Linear Regression to analyze the data below:

Row, interest_rate ('label'), credit_score
1 7.31 500.0
2 6.7 600.0
3 5.95 700.0
4 6.4 700.0
5 5.4 800.0
6 5.7 800.0
7 5.9 750.0
8 7.0 550.0
9 6.5 650.0
10 5.7 825.0

The "Description" of the resulting Linear Regression model says:
- 0.005 * credit_score
+ 10.000

When I applied the above model to the same data set, I got the output below:

Row No, interest_rate, prediction(interest_rate), credit_score
1 7.31 7.277008403361254 500.0
2 6.7 6.732470588235252 600.0
3 5.95 6.18793277310925 700.0
4 6.4 6.18793277310925 700.0
5 5.4 5.643394957983249 800.0
6 5.7 5.643394957983249 800.0
7 5.9 5.915663865546249 750.0
8 7.0 7.004739495798253 550.0
9 6.5 6.460201680672251 650.0
10 5.7 5.507260504201748 825.0

Using the linear model as displayed, the predicted interest_rate for credit_score 500 (row 1) should be:

-0.005 * 500 + 10 = 7.5

Please share any thoughts on the discrepancy in the prediction (7.5 vs. 7.2770 in row 1).

Thank you

Answers

  • David_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research
    Hi,

    The results are correct.
    The apparent discrepancy comes from how RapidMiner displays the results.
    In the result view, all numbers are rounded to a fixed number of decimal places (three in the model description above). If you copy and paste the coefficients into another editor, you will see that the actual values are
    -0.0054453781512600165 and 9.999697478991262. With these numbers you get:
    -0.0054453781512600165 * 500 + 9.999697478991262 = 7.277008.

    You can set the number of digits displayed under Settings -> Preferences -> General.

    Best,
    David

  • haddock Member Posts: 849 Maven
    Great answer.

    H
  • rso Member Posts: 5 Contributor II
    David, thank you very much.
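
The rounding explanation in David's answer can be checked outside RapidMiner. The sketch below is plain Python, not RapidMiner code. It assumes the model is an ordinary least-squares fit of interest_rate on credit_score, and the names `fit_line`, `full_precision` and `rounded_display` are made up for illustration. It refits the ten rows from the question and compares the row 1 prediction from the full-precision coefficients with the one from coefficients rounded to three decimal places.

```python
# Rows from the question as (credit_score, interest_rate) pairs.
data = [
    (500.0, 7.31), (600.0, 6.7), (700.0, 5.95), (700.0, 6.4), (800.0, 5.4),
    (800.0, 5.7), (750.0, 5.9), (550.0, 7.0), (650.0, 6.5), (825.0, 5.7),
]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept (assumed model)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(data)

# Prediction for credit_score = 500 with the unrounded coefficients.
full_precision = slope * 500 + intercept

# Prediction using the coefficients as shown in the description (3 decimals).
rounded_display = round(slope, 3) * 500 + round(intercept, 3)

print("slope:", slope)
print("intercept:", intercept)
print("prediction, full precision:", full_precision)
print("prediction, rounded coefficients:", rounded_display)
```

Under these assumptions, the fitted slope and intercept agree with the values David quotes, the full-precision prediction for row 1 comes out at about 7.277, and the rounded coefficients give 7.5. The gap is large because the rounding error in the slope (about 0.00045) is multiplied by a credit score of 500.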