Linear Regression Prediction - wrong calculation?

rsorso Member Posts: 5 Contributor I
edited November 2018 in Help
Hi, I used Linear Regression to analyze the data below:

Row  interest_rate (label)  credit_score
1    7.31                   500.0
2    6.7                    600.0
3    5.95                   700.0
4    6.4                    700.0
5    5.4                    800.0
6    5.7                    800.0
7    5.9                    750.0
8    7.0                    550.0
9    6.5                    650.0
10   5.7                    825.0

The Linear Regression model shown under "Description" reads:
- 0.005 * credit_score
+ 10.000

When I applied this model to the same data set, I got the following output:

Row No.  interest_rate  prediction(interest_rate)  credit_score
1        7.31           7.277008403361254          500.0
2        6.7            6.732470588235252          600.0
3        5.95           6.18793277310925           700.0
4        6.4            6.18793277310925           700.0
5        5.4            5.643394957983249          800.0
6        5.7            5.643394957983249          800.0
7        5.9            5.915663865546249          750.0
8        7.0            7.004739495798253          550.0
9        6.5            6.460201680672251          650.0
10       5.7            5.507260504201748          825.0

Using the displayed model, the predicted interest_rate for credit_score 500 (row 1) should be:

-0.005 * 500 + 10 = 7.5

Please share any thoughts on the discrepancy in the prediction (7.5 vs. 7.2770 for row 1).

Thank you

Answers

  • David_A Moderator, Employee, RMResearcher, Member Posts: 177  RM Research
    Hi,

    The results are correct.
    The apparent discrepancy comes from how RapidMiner displays the results.
    In the result view, all numbers are rounded to a few decimal places for display (three in the model description above). If you copy and paste the coefficients into another editor, you will see that the actual values are
    -0.0054453781512600165 and 9.999697478991262. With these numbers you get:
    -0.0054453781512600165 * 500 + 9.999697478991262 = 7.277008

    You can set the number of digits displayed under Settings -> Preferences -> General.

    Best,
    David
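The explanation above can be checked outside RapidMiner. The following is a minimal sketch, assuming the model is a plain ordinary least squares fit of interest_rate on credit_score (RapidMiner's Linear Regression operator has further options that are not modelled here). The helper name `fit_line` and the variable names are illustrative only. It refits the line from the data in the question, then compares the prediction for credit_score 500 using full-precision coefficients against the prediction using coefficients rounded to three decimals, as the model description displays them.

```python
# Data from the question: (credit_score, interest_rate) pairs.
data = [
    (500.0, 7.31), (600.0, 6.7), (700.0, 5.95), (700.0, 6.4), (800.0, 5.4),
    (800.0, 5.7), (750.0, 5.9), (550.0, 7.0), (650.0, 6.5), (825.0, 5.7),
]


def fit_line(points):
    """Closed-form ordinary least squares for y = slope * x + intercept.

    Hypothetical helper for illustration; not RapidMiner's implementation.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(data)

# Coefficients as a three-decimal display would show them.
shown_slope = round(slope, 3)
shown_intercept = round(intercept, 3)

credit_score = 500.0
full_prediction = slope * credit_score + intercept
rounded_prediction = shown_slope * credit_score + shown_intercept

print("full-precision coefficients:", slope, intercept)
print("displayed coefficients:     ", shown_slope, shown_intercept)
print("prediction (full precision):", full_prediction)
print("prediction (rounded coeffs):", rounded_prediction)
```

Under these assumptions the full-precision prediction comes out at about 7.277, matching the prediction column in the question, while the rounded coefficients give 7.5. The small rounding error in the slope (about 0.00045) is multiplied by 500, which accounts for most of the gap.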

  • haddock Member Posts: 849  Guru
    Great answer.

    H
  • rsorso Member Posts: 5 Contributor I
    David, thank you very much.