
# Another kind of performance measurement for time series

edited June 2019 in Help

Dear all,

Since no one seems to have an idea for a workaround, I would like to bring up this topic again as a feature request.
Original post: http://rapid-i.com/rapidforum/index.php/topic,6399.msg22389.html#msg22389

Especially in financial data mining, one would build a model not on the actual stock price but on the difference from the previous day.
Consequently, the result of a prediction process is an estimate of the change in price from one day to the next.

The currently available "forecasting performance" operator for series determines whether the predicted trend is correct, i.e. whether the prediction and the actual value lie on the same side of today's value.
(e.g. delta[today] = 4; delta[prediction for tomorrow] = 6; delta[tomorrow] = 5 >> trend is true because tomorrow > today AND prediction > today)

This is not sufficient, however, to determine win/loss.
(e.g. delta[today] = -4; delta[prediction for tomorrow] = -3; delta[tomorrow] = -2 >> trend is true but the share still loses value)

Hence, the main question is rather whether delta[tomorrow] will be positive or negative.
(e.g. delta[prediction for tomorrow] = -3; delta[tomorrow] = -2 >> trend should be true because prediction and tomorrow have the same sign)
(e.g. delta[prediction for tomorrow] = 4; delta[tomorrow] = -1 >> trend should be false)
(e.g. delta[prediction for tomorrow] = 1; delta[tomorrow] = 3 >> trend should be true)
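To make the requested measure concrete, here is a minimal Python sketch of a sign-based hit rate, alongside the trend check as it is described above. This is not RapidMiner code; the function names (`sign_hit`, `sign_accuracy`, `trend_hit`) are made up for illustration, and treating a delta of exactly 0 as its own class is an assumption, since the post does not say how zero changes should count.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def sign_hit(predicted_delta, actual_delta):
    """Requested measure: the prediction counts as correct when the
    predicted and the actual change have the same sign.
    Assumption: a change of exactly 0 only matches another 0."""
    return sign(predicted_delta) == sign(actual_delta)


def trend_hit(today_delta, predicted_delta, actual_delta):
    """Trend check as described in the post: prediction and actual value
    lie on the same side of today's value."""
    return sign(predicted_delta - today_delta) == sign(actual_delta - today_delta)


def sign_accuracy(predicted_deltas, actual_deltas):
    """Fraction of days on which the predicted and actual sign agree."""
    if len(predicted_deltas) != len(actual_deltas):
        raise ValueError("both series must have the same length")
    if not predicted_deltas:
        raise ValueError("series must not be empty")
    hits = sum(sign_hit(p, a) for p, a in zip(predicted_deltas, actual_deltas))
    return hits / len(predicted_deltas)


# The three examples from the post: (-3, -2), (4, -1), (1, 3)
predicted = [-3, 4, 1]
actual = [-2, -1, 3]
print([sign_hit(p, a) for p, a in zip(predicted, actual)])
print(sign_accuracy(predicted, actual))
```

Applied to the three examples above, the first and third count as hits and the second as a miss. Note that `trend_hit` needs today's delta as a reference, whereas `sign_hit` only compares the prediction with the actual change.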

Can anyone suggest how to implement this kind of performance measurement?

PS: With the existing operator I obtained fairly good prediction trend accuracy rates of 0.7 to 0.8, but the overall win/loss simulation was only slightly above 0.5 due to the issue described above. So I was wondering whether a different kind of data preprocessing could help (e.g. transforming the stock values into binominal data like "up" and "down", but SVMs are not able to handle binominal data). So far I calculate the daily percentage change for all attributes and the label. The attributes that correlate best are then used to build an SVM model. Does anyone happen to know whether there are other essential preprocessing steps to improve prediction quality?
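For reference, the two preprocessing steps mentioned in the PS (daily percentage change and an "up"/"down" transformation) could look roughly like this outside of RapidMiner. It is only a sketch: the helper names `pct_change` and `direction_labels` and the sample prices are hypothetical, and counting a change of exactly 0 as "down" is an arbitrary assumption.

```python
def pct_change(prices):
    """Daily percentage change: 100 * (today - yesterday) / yesterday.
    The result has one element fewer than the input series."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


def direction_labels(changes):
    """Turn numeric changes into a two-class label.
    Assumption: a change of exactly 0 is counted as "down"."""
    return ["up" if c > 0 else "down" for c in changes]


# Hypothetical closing prices, for illustration only
prices = [100.0, 110.0, 99.0]
changes = pct_change(prices)
print(changes)
print(direction_labels(changes))
```

If a learner only accepts numeric labels, the same two classes could be encoded as +1 and -1 instead of strings.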

Kind regards
Sachs