"Improve root_mean_squared_error in results with testing dataset"

pepe_jaen Member Posts: 1 Contributor I
edited May 2019 in Help
Hello all,
This is my first topic in this forum. I would be very grateful for any help with my RapidMiner process :).

I'm trying to create a model that estimates a float label from 9 attributes in a data set of 430 examples. Those 9 attributes were selected beforehand using wrapper validation and a correlation matrix.
I trained an SVM on the first 380 examples. After using "Loop Parameters", I obtained SVM [kernel = epanechnikov; kernel cache = 200; C = 10.0; convergence epsilon = 0.001 ...].
With these SVM parameters I reached a root_mean_squared_error of 0.0001 on the training examples (using the "Apply Model" + "Performance" operators). The label and the label predicted by this model are practically the same (100% performance).

If I test on the rest of the dataset (49 examples), the root_mean_squared_error is very high (1.411), and the predictions are not close to the label values.
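To make the comparison concrete, here is a minimal Python sketch of the two measurements described above: RMSE on the examples the model was trained on versus RMSE on the later, held-out examples. It is not the RapidMiner process itself. The data is synthetic, and `NearestNeighbour` is a hypothetical stand-in for any model flexible enough to memorise its training set; only the 380/49 split sizes come from the post.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def chronological_split(rows, n_train):
    """Keep the original order: the first n_train rows train, the rest test."""
    return rows[:n_train], rows[n_train:]


class NearestNeighbour:
    """Hypothetical stand-in model: returns the label of the closest training x."""

    def fit(self, xs, ys):
        self.xs, self.ys = list(xs), list(ys)
        return self

    def predict(self, xs):
        def closest(x):
            return min(range(len(self.xs)), key=lambda i: abs(self.xs[i] - x))
        return [self.ys[closest(x)] for x in xs]


# Synthetic series (assumption, for illustration only): noisy upward trend.
rng = random.Random(0)
data = [(float(t), 0.01 * t + rng.gauss(0, 1)) for t in range(429)]

train, test = chronological_split(data, 380)
model = NearestNeighbour().fit([x for x, _ in train], [y for _, y in train])

# Scoring on the training rows only shows how well the model memorised them.
train_rmse = rmse([y for _, y in train], model.predict([x for x, _ in train]))
# Scoring on the later rows is the estimate that matters for new data.
test_rmse = rmse([y for _, y in test], model.predict([x for x, _ in test]))

print("train RMSE:", train_rmse)
print("test RMSE:", test_rmse)
```

A near-zero training error next to a much larger held-out error, as in this sketch, is the usual signature of a model that has fitted the training examples too closely.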

I also tried Windowing with horizon = 1 and Sliding Window Validation, with the same result: 100% performance on the training dataset and very low performance on the testing dataset.
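For readers less familiar with the idea, sliding window validation can be sketched in Python as below. This is only an illustration of the general scheme, not the implementation of the RapidMiner operator, and the window sizes (`train_size`, `test_size`, `step`) are hypothetical values chosen for the example.

```python
def sliding_window_folds(n_examples, train_size, test_size, step):
    """Yield (train_indices, test_indices) pairs that move forward in time.

    Every test window starts right after its training window, so the
    model is always evaluated on examples later than the ones it saw.
    """
    start = 0
    while start + train_size + test_size <= n_examples:
        train_end = start + train_size
        train_idx = list(range(start, train_end))
        test_idx = list(range(train_end, train_end + test_size))
        yield train_idx, test_idx
        start += step


# Hypothetical window sizes for a series of 430 examples.
folds = list(sliding_window_folds(n_examples=430, train_size=100,
                                  test_size=10, step=10))

for train_idx, test_idx in folds[:3]:
    print("train", train_idx[0], "to", train_idx[-1],
          "| test", test_idx[0], "to", test_idx[-1])
```

The error averaged over the test windows, rather than the error on the training windows, is the number to compare between parameter settings.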

Is there any way to train the model so that it improves the performance of the predicted label on the testing dataset (the last 49 examples)?
Any other ideas for improving it?

Thanks in advance.