[Solved] Results of the forecasting performance operator seem illogical
Dear all,
I found several entries in the forum regarding this topic but it is still not clear to me what this operator really does.
In this post for example there is a description: http://rapid-i.com/rapidforum/index.php/topic,2680.0.html
In the attached sample process I create a label attribute from att2 with the windowing operator and feed this into a validation. Here is a fictitious testing example set:
label(att2) prediction att1 att2 att3
4 5 8 1 2
7 9 2 4 0
I would assume that only one row is necessary to determine the trend per iteration.
(e.g. label(4)>att2(1) AND prediction(5)>att2(1) --> prediction trend should be correct)
But it seems to work in a different way...
- How does the forecasting performance operator actually calculate the performance?
- How does the performance operator know which attribute has been forecasted by the label, e.g. att2?
(this cannot be set in the parameters)
- Why does the operator need at least two rows? If the test window width is set to 1, the result is "unknown".
(Windowing has already been done in advance, so the label is already shifted and no second row should be needed.)
- What influence does the parameter "main criterion" have?
Please advise...
Best regards
Sachs
The attached sample shows a simple process which is supposed to train a model on forecasting time series.
The first breakpoint shows the validation's testset after the model made its prediction.
The second breakpoint shows the corresponding performance of the model for this specific testset.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="502" width="547">
      <operator activated="true" class="subprocess" compatibility="5.2.008" expanded="true" height="76" name="Generate Data" width="90" x="45" y="30">
        <process expanded="true" height="383" width="299">
          <operator activated="true" class="generate_data" compatibility="5.2.008" expanded="true" height="60" name="Generate Data (2)" width="90" x="45" y="30">
            <parameter key="number_examples" value="30"/>
            <parameter key="number_of_attributes" value="3"/>
            <parameter key="attributes_lower_bound" value="0.0"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.2.008" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <connect from_op="Generate Data (2)" from_port="output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.2.000" expanded="true" height="76" name="Windowing" width="90" x="179" y="30">
        <parameter key="horizon" value="1"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="att2"/>
      </operator>
      <operator activated="true" class="series:sliding_window_validation" compatibility="5.2.000" expanded="true" height="112" name="Validation" width="90" x="313" y="30">
        <parameter key="training_window_width" value="5"/>
        <parameter key="training_window_step_size" value="1"/>
        <parameter key="test_window_width" value="2"/>
        <process expanded="true" height="502" width="165">
          <operator activated="true" class="support_vector_machine_linear" compatibility="5.2.008" expanded="true" height="76" name="SVM (Linear)" width="90" x="45" y="30"/>
          <connect from_port="training" to_op="SVM (Linear)" to_port="training set"/>
          <connect from_op="SVM (Linear)" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true" height="502" width="299">
          <operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" breakpoints="before,after" class="series:forecasting_performance" compatibility="5.2.000" expanded="true" height="76" name="Performance" width="90" x="179" y="30">
            <parameter key="horizon" value="1"/>
            <parameter key="main_criterion" value="prediction_trend_accuracy"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Generate Data" from_port="out 1" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Answers
Your example is seriously complicated.
Why make att2 the label attribute, why not att1?
Better have an example that is so logical, there is only 1 way to do it.
Time series data is almost never normally distributed.
Which is what your random number generator makes.
Why not make a sine curve or something...
I know I'm a bit harsh asking these questions.
But it's really hard to understand how a validation process works without having a good forecasting task first.
Your questions are valid and I will try to explain things better. I am glad that there are people out there who take the time to help others and keep improving RapidMiner! That said, back to the topic.
1) How does the forecasting performance operator actually calculate the performance?
Following your suggestion, I simplified the process so that there is only one attribute, which represents a parabola based on the function f(x)=(x-15)^2. (There might be easier ways to generate functions with operators, but I don't know how, and I am not sure to what extent a normal distribution affects the result. Anyway, I hope this will do.)
To reproduce the data, press the F11 key (run process) 10 times. The result is:
label prediction att1
0 16.2 1
1 15.9 0
According to my understanding the calculation should be like:
- att1 (here 1) is today's value
- label (here 0) is tomorrow's value (because of the windowing done before)
- prediction is the result of the applied model
-> if label < att1 (today) the real trend is down
-> if prediction > att1 the predicted trend is up
-->> therefore, the prediction trend accuracy should be 0 for this iteration but it is 1 (press F11 again).
2) Why does the operator need at least two rows? If the test window width in the "validation" operator is set to 1, the result is "unknown".
Using the method described above it should be sufficient to have only a single row in the test set to determine the prediction trend.
3) How does the performance operator know which attribute has been forecasted by the label?
In this simplified data set I have only one attribute, but imagine that there are three. The SVM learns how these three attributes contribute to predicting the label. The label has been created by windowing one of the three attributes, and that attribute holds today's value. So how does the forecasting performance operator know which attribute to compare the label and the prediction with in order to determine the trend?
4) What influence has the parameter "main criterion"?
I can set the parameter "main criterion" either to "first" or "prediction trend accuracy". But additionally, there is a checkbox "prediction trend accuracy". Does it have the same effect? What does "first" stand for?
These are quite a lot of questions, but they might all come down to the same catch.
I have spent days trying to understand the forecasting performance operator, so I would be happy about any comment.
Best regards
Sachs
# label att1
1 169 196
2 144 169
3 121 144
4 100 121
5 81 100
6 64 81
7 49 64
8 36 49
9 25 36
10 16 25
11 9 16
12 4 9
13 1 4
14 0 1
15 1 0
16 4 1
17 9 4
18 16 9
19 25 16
20 36 25
21 49 36
22 64 49
23 81 64
24 100 81
25 121 100
26 144 121
27 169 144
28 196 169
29 225 196
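For readers who want to reproduce the table outside RapidMiner, here is a minimal Python sketch. It assumes x runs from 1 to 30 (inferred from the first att1 value, 196 = 14^2) and that the windowing uses a window size of 1 and a horizon of 1, so each label is simply the next value of the series. The names `series` and `rows` are made up for this illustration.

```python
# Parabola f(x) = (x - 15)^2, assuming x = 1..30 (inferred from the table).
series = [(x - 15) ** 2 for x in range(1, 31)]

# Windowing with window size 1 and horizon 1:
# att1 is the current value, label is the value one step ahead.
rows = [(series[i + 1], series[i]) for i in range(len(series) - 1)]

print("# label att1")
for number, (label, att1) in enumerate(rows, start=1):
    print(number, label, att1)
```

The last series value has no successor, which is why 30 series values yield only 29 rows.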
Let's imagine that att1 is the temperature 1 hour ago.
And label is the current temperature.
The task is temperature prediction 1 hour ahead in time.
I'm changing your process to use k=1 Nearest Neighbors as the machine learning algorithm, instead of SVM.
You decided to use prediction trend accuracy as a measure of performance, using a test window width of 2 and a training window width of 8.
So now we can just look at the data and know what the outcome should be.
You explicitly reference data points 14 and 15 in your data (this is the only data you posted).
14 0 1
15 1 0
Since 13 is the nearest neighbor for both data-points, you get 4 as a prediction for both points.
Since 4 !=0 and 4 != 1 you get a prediction trend accuracy of 0.
So to answer your questions.
1)
Prediction trend accuracy basically converts a regression task into a classification task.
If you predicted the temperature to go up (and it did go up) you get accuracy 100%, else you get accuracy 0%.
Note that it is possible to achieve the same effect by manually generating a label attribute which is binary and then use classification accuracy as a measure of performance. The prediction trend accuracy is just a shortcut, so you don't have to kludge your process with too many operators.
2)
You need 2 rows, because you can only compute a trend when you have 2 rows (how else to decide if the temperature went up or down).
3)
This question is irrelevant, the k-NN operator combines all inputs to form 1 prediction output. This prediction output attribute is typically named prediction(label). Performance is always calculated using prediction(label) and label. If this is not the case, it would be wise to change the naming of your attributes to make it so.
4)
Since forecasting performance only has 1 performance measure: prediction trend accuracy, the main criterion is always prediction trend accuracy. But maybe this operator will be extended to have additional measures of performance in the future (e.g. sMAPE). Look at the regression and classification performance operators. There the main criterion drop down box tells the outer operator which measure of performance to use. Maybe you want to look at both accuracy and correlation, but only use accuracy to decide which learner is best.
To give a more detailed example for other readers to make it more clear:
I thought that trend accuracy works like in the following example:
label prediction attribute
0 24 1
label = tomorrow
prediction = prediction of tomorrow
attribute = today
if ((tomorrow>today) AND (prediction of tomorrow>today)) then "trend is true" ELSE "trend is false"
The same is true for "<", as long as it's the same on both sides. An equivalent notation is:
if ((label>attribute) AND (prediction>attribute)) then "trend is true" ELSE "trend is false"
But if only the label and the prediction are given, one needs two samples:
row label prediction
a 1 25
b 0 24
label{b} = tomorrow
prediction{b} = prediction of tomorrow
label{a} = today
if ((label{b}>label{a}) AND (prediction{b}>label{a})) then "trend is true" ELSE "trend is false"
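The two variants described above can be written out as a small Python sketch. It only mirrors the rule as stated in this post, not necessarily what the operator does internally. The helper names `direction` and `trend_matches` are hypothetical, and treating a flat trend as a miss is an assumption that follows from the strict ">" and "<" comparisons in the rule.

```python
def direction(value, reference):
    """Return +1 if value is above reference, -1 if below, 0 if equal."""
    return (value > reference) - (value < reference)


def trend_matches(today, tomorrow, predicted_tomorrow):
    """True if the actual and the predicted trend point the same way.

    Assumption: a flat trend (no change) counts as a miss, because the
    rule in the post uses strict comparisons.
    """
    actual = direction(tomorrow, today)
    predicted = direction(predicted_tomorrow, today)
    return actual == predicted and actual != 0


# Variant 1: one row, today's value comes from the attribute.
single_row = trend_matches(today=1, tomorrow=0, predicted_tomorrow=24)

# Variant 2: two rows, today's value is the label of the previous row.
row_a = {"label": 1, "prediction": 25}
row_b = {"label": 0, "prediction": 24}
two_rows = trend_matches(
    today=row_a["label"],
    tomorrow=row_b["label"],
    predicted_tomorrow=row_b["prediction"],
)

print(single_row, two_rows)
```

In both variants the actual trend is down while the predicted trend is up, so the rule reports "trend is false" for these numbers.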
Then the only remaining question is what the parameter "main criterion" does...
Thanks a lot & all the best
Sachs
Please check out my previous post, it should be updated now.
On a side note, I think you only want to use the prediction trend accuracy operator when you have a very big test window width.
Generally this is only the case when you have a very large number of examples in your dataset.
For smaller data sets it's better to use a test window width of 1.
And then simply use mean absolute error as a measure of performance.
If you want to compute some normalized error simply do your validation twice:
- Using your own learner
- Using some stupid learner (e.g. zero rule)
This way you can actually interpret the results you are getting.
Let's say you find that your learner has a mean absolute error of 500.
And the zero rule mean absolute error is 1000.
Then your learner performs twice as well as zero rule (whose performance is what you would expect by chance alone).
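The comparison against a baseline can be sketched in a few lines of Python. This is only an illustration: it assumes that the zero rule learner predicts the mean of the training labels for every test example, and the learner predictions are invented numbers, not output from RapidMiner.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Training and test labels taken from the parabola table in this thread.
train_labels = [169, 144, 121, 100, 81, 64, 49, 36]
test_labels = [25, 16]

# Assumption: zero rule always predicts the mean of the training labels.
baseline_value = sum(train_labels) / len(train_labels)
baseline_predictions = [baseline_value] * len(test_labels)

# Hypothetical predictions of "your own learner" (made up for illustration).
learner_predictions = [30, 20]

mae_learner = mean_absolute_error(test_labels, learner_predictions)
mae_baseline = mean_absolute_error(test_labels, baseline_predictions)

# A ratio below 1 means the learner beats the baseline; 0.5 would be
# "twice as good" in the sense used above.
relative_error = mae_learner / mae_baseline
print(mae_learner, mae_baseline, relative_error)
```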
Furthermore, computing trends on only two examples is very restricted.
Look at the source code:
https://rapidminer.svn.sourceforge.net/svnroot/rapidminer/Plugins/ValueSeries/Unuk/src/com/rapidminer/operator/performance/PredictionTrendAccuracy.java
* Measures the number of times a regression prediction correctly determines the trend. This performance measure assumes that the attributes
* of each example represents the values of a time window, the label is a value after a certain horizon which should be predicted. All
* examples build a consecutive series description, i.e. the labels of all examples build the series itself (this is, for example, the case
* for a windowing step size of 1). This format will be delivered by the Series2ExampleSet operators provided by RapidMiner.
* This performance measure then calculates the actual trend between the last time point in the series (T3 here) and the actual label (L)
* and compares it to the trend between T3 and the prediction (P), sums the products between both trends, and divides this sum by the total
* number of examples, i.e. [(if ((v4-v3)*(p1-v3)>=0), 1, 0) + (if ((v5-v4)*(p2-v4)>=0), 1, 0) +...] / 7 in this example.
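One possible reading of the quoted comment, written as a Python sketch. It is not a port of the Java class. It assumes that the labels of consecutive examples form the series, so the reference value for each example is the label `horizon` rows earlier, and that the sum is divided by the number of comparable pairs (the comment itself says "total number of examples"). Returning `None` when no pair exists is also an assumption.

```python
def prediction_trend_accuracy(labels, predictions, horizon=1):
    """Fraction of examples whose predicted trend matches the actual trend.

    Assumption: the reference point for example i is the label of
    example i - horizon, i.e. the labels themselves form the series.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    hits = 0
    pairs = 0
    for i in range(horizon, len(labels)):
        reference = labels[i - horizon]
        actual_trend = labels[i] - reference
        predicted_trend = predictions[i] - reference
        # A non-negative product means both trends point the same way
        # (or at least one of them is flat), as in the quoted formula.
        if actual_trend * predicted_trend >= 0:
            hits += 1
        pairs += 1
    # Assumption: without any comparable pair the measure is undefined.
    return hits / pairs if pairs else None


# The two test rows posted earlier in this thread (label, prediction).
two_rows = prediction_trend_accuracy([0, 1], [16.2, 15.9])
# A single test row leaves nothing to compare against.
one_row = prediction_trend_accuracy([0], [16.2])
print(two_rows, one_row)
```

Under these assumptions the two posted rows give an accuracy of 1, and a single row gives no value at all. That would be consistent with the results reported in this thread, but it has not been checked against the operator's source.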
My interpretation after reading this source code, is that it is best to avoid the prediction trend accuracy operator.
Only use prediction trend accuracy if other measures of performance fail.
I don't have the words to thank you enough for your patience! This was a really outstanding coaching lesson for me! Thank you so much!
Best regards
Sachs