
# "Linear regression beats ANN"

**chaosbringer** · Member Posts: 21 · Contributor II
Hi,

I have a dataset consisting of 1,000 samples and 19 attributes. The data is housing data (living area, presence of heating, bath, neighborhood characteristics, etc.). The target value is the house price. Eight of the attributes are binary.

If I apply linear regression to this dataset, the results are far superior to an ANN, although from my understanding the data is too complex for linear regression.

Decision trees and SVMs are also inferior to linear regression.

Do you have any advice on how I can validate the results and check why linear regression performs so well?

Thank you very much.


## Answers

Posts: 537 · Maven

A linear model with 19 parameters is still a fairly complex model.
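To make that concrete: 19 attributes plus an intercept already give a linear model 20 free parameters. A minimal numpy sketch of fitting such a model with ordinary least squares — the data here is a synthetic stand-in (11 numeric and 8 binary columns is an assumption, since the real housing set isn't shown):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Synthetic stand-in for the housing data: 11 numeric and 8 binary
# attributes (the real attribute mix is an assumption).
X_num = rng.normal(size=(n, 11))
X_bin = rng.integers(0, 2, size=(n, 8)).astype(float)
X = np.hstack([X_num, X_bin, np.ones((n, 1))])  # intercept column -> 20 parameters

true_w = rng.normal(size=20)
y = X @ true_w + rng.normal(scale=0.1, size=n)  # linear signal plus small noise

# Ordinary least squares fits all 20 parameters in closed form.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rmse = np.sqrt(np.mean((X @ w_hat - y) ** 2))
```

When the underlying relationship really is close to linear (as house prices often are in attributes like living area), those 20 parameters are enough for a very good fit, which would explain linear regression winning.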

Posts: 21 · Contributor II

Thank you for your answer.

Yes, the data is still complex. But I still do not understand why the ANN is so bad.

Even with cross-validation I get:

MSQE with lin. reg: 0.34

MSQE with ANN: 0.54

Is there an explanation for this? How can I shed some light on the details? Why is the ANN so bad in comparison to linear regression?

Thank you very much.

Posts: 537 · Maven

E.g., measure the RMSE at every iteration.

Maybe you need to train your network for many more iterations.

With 19 inputs, your network gets very big, very fast, so you have lots of weights to optimize.

An alternative problem could be premature convergence, e.g. getting stuck in local optima.

Best regards,

Wessel
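The "measure the RMSE at every iteration" advice can be sketched in a few lines. This is a deliberately simple stand-in — a two-weight linear model trained by full-batch gradient descent on toy data, not RapidMiner's ANN operator — but the monitoring idea is the same: record the RMSE after each update and look at the curve to see whether training has converged or was stopped too early:

```python
import math
import random

random.seed(0)

# Toy regression data (assumed, for illustration): y = 2*x1 - x2 + noise
X = [(random.random(), random.random()) for _ in range(200)]
y = [2 * a - b + random.gauss(0, 0.05) for a, b in X]

w1, w2 = 0.0, 0.0  # weights to optimize
lr = 0.1           # learning rate
history = []       # RMSE measured at every iteration

for epoch in range(200):
    # full-batch gradient of the mean squared error
    g1 = g2 = 0.0
    for (a, b), target in zip(X, y):
        err = w1 * a + w2 * b - target
        g1 += 2 * err * a / len(X)
        g2 += 2 * err * b / len(X)
    w1 -= lr * g1
    w2 -= lr * g2
    rmse = math.sqrt(
        sum((w1 * a + w2 * b - t) ** 2 for (a, b), t in zip(X, y)) / len(X)
    )
    history.append(rmse)
```

If the curve in `history` is still falling at the last iteration, the model was undertrained — in a real network with 19 inputs and many more weights, far more iterations can be needed before the RMSE flattens out.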

Posts: 21 · Contributor II

Thank you, that helped. Fiddling with the parameters improved the situation significantly.

However, another problem arises:

The t-test says that the means are the same (p = 1.0).

If I deliberately modify the parameters of the neural net to produce a really bad result, the t-test still returns 1.

How can it be that the t-test returns 1, even though the RMSEs are very different (0.5 vs. 0.34)?

Thank you
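One thing worth checking here is what the t-test is actually being fed: a paired t-test compares the per-fold errors of the two models, not two already-averaged RMSEs. A pure-Python sketch of the paired t statistic (the helper name is hypothetical):

```python
import math

def paired_t_statistic(errors_a, errors_b):
    """t statistic of a paired t-test on matched per-fold errors."""
    d = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # sample variance
    if var_d == 0.0:
        # identical difference in every fold: t = 0 (p = 1) if the mean
        # difference is zero, otherwise unbounded (p -> 0)
        return 0.0 if mean_d == 0.0 else math.copysign(math.inf, mean_d)
    return mean_d / math.sqrt(var_d / n)

# Clearly different per-fold RMSEs (0.34-ish vs 0.5-ish) give a large |t|,
# which corresponds to a small p-value:
t = paired_t_statistic([0.33, 0.35, 0.34, 0.32, 0.36],
                       [0.50, 0.52, 0.49, 0.51, 0.48])
```

If |t| comes out large for your per-fold numbers but the operator still reports p = 1.0, a plausible cause is that the test is being given the same (averaged) value twice — for example, both inputs wired to one model's result — rather than the two sets of fold-level errors.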

Posts: 3 · Contributor I

I am used to evaluating models with AUCs. What numbers are you getting? Is MSQE the mean squared error?