"Time series forecast (with Rapid Miner)"
Hi!
I've set up a model exactly as described by Thomas Ott of 'neuralmarkettrends' in videos 8-10, and it's working well so far.
But what I still need is the probability output for the predicted label (horizon = 1). The model only gives the average values, in the form of
prediction_trend_accuracy: 0.807 +/- 0.067 (mikro: 0.807).
Thanks for your help!
Answers
I'm now using Google to find the video you describe.
Next time please use a direct link to the video that is of interest.
Video link:
https://www.youtube.com/watch?v=UmGIGEJMmN8
Can you upload your process?
As far as I understand, the process is as follows:
- Order your data by date
- Split your data into two parts
- Use data before date X for training, use data after date X for testing
- Features for training are created using windowing
- SVM is used as the learner
* This process does not deal with horizons very well; neuralmarkettrends1 is aware of this fact, but does not want to complicate his video.
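The windowing and fixed-split steps listed above can be sketched outside RapidMiner roughly as follows. This is only an illustration in plain Python, not the process from the video: the function names make_windows and split_by_position are made up for this example, the split uses a row position instead of a date, and no learner is trained. It also does not try to address the horizon caveat mentioned above.

```python
def make_windows(series, window_size=3, horizon=1):
    """Turn a 1-D series into (features, label) rows.

    Each row holds `window_size` consecutive values; the label is the value
    `horizon` steps after the last value of the window.
    """
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = list(series[start:start + window_size])
        label = series[start + window_size + horizon - 1]
        rows.append((window, label))
    return rows


def split_by_position(rows, cutoff):
    """Fixed split: rows before `cutoff` train, the rest test (data is ordered by date)."""
    return rows[:cutoff], rows[cutoff:]


series = [1, 2, 3, 4, 5, 6, 7, 8]  # toy data, already ordered by date
rows = make_windows(series, window_size=3, horizon=1)
train, test = split_by_position(rows, cutoff=3)
print(train)
print(test)
```

A learner (SVM in the video) would then be fitted on train and evaluated on test.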
Now to answer your question:
My suggestion would be to rescale absolute error to fall into range 0 to 1, and use this as a measure of probability.
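One possible reading of this suggestion, sketched in plain Python. The function name rescale_abs_error is made up for this example, and turning the scaled error into a confidence via 1 - value is an assumption, not something RapidMiner does for you. Note that this needs the true label, so it only works for examples where the label is already known, and the result is a heuristic score, not a calibrated probability.

```python
def rescale_abs_error(predictions, labels):
    """Min-max rescale the absolute errors |pred - label| into the range 0 to 1."""
    abs_errors = [abs(p - y) for p, y in zip(predictions, labels)]
    lo, hi = min(abs_errors), max(abs_errors)
    if hi == lo:
        # All errors are equal: there is nothing to rescale (assumption: map to 0.0).
        return [0.0 for _ in abs_errors]
    return [(e - lo) / (hi - lo) for e in abs_errors]


predictions = [0.5, 0.2, 0.9]  # toy values
labels = [0.4, 0.6, 0.9]
scaled = rescale_abs_error(predictions, labels)
# Assumption: small error means high confidence, so invert the scaled error.
confidence = [1.0 - s for s in scaled]
print(scaled)
print(confidence)
```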
This is the best answer I can give right now.
You need to provide better information to get a better answer.
Best regards,
Wessel
You are right, the question was a bit too imprecise; however, you got it right, that's the way I'm doing it.
Unfortunately, I don't know exactly what to do with your answer: "Now to answer your question:
My suggestion would be to rescale absolute error to fall into range 0 to 1, and use this as a measure of probability".
Where do I get the absolute error from?
Thank you in advance!
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="5.3.008" expanded="true" height="60" name="Gen TS" width="90" x="45" y="30">
<parameter key="target_function" value="driller oscillation timeseries"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Create Sum" width="90" x="180" y="30">
<list key="function_descriptions">
<parameter key="sum" value="str(11*att1+22*att2+33*att3+44*att4+att5)"/>
</list>
</operator>
<operator activated="true" class="guess_types" compatibility="5.3.008" expanded="true" height="76" name="Guess Types" width="90" x="315" y="30"/>
<operator activated="true" class="select_attributes" compatibility="5.3.008" expanded="true" height="76" name="Select Sum" width="90" x="450" y="30">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sum"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.3.008" expanded="true" height="94" name="Normalize" width="90" x="585" y="30">
<parameter key="method" value="range transformation"/>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="76" name="Win 3 2" width="90" x="720" y="30">
<parameter key="window_size" value="3"/>
<parameter key="create_label" value="true"/>
<parameter key="label_attribute" value="sum"/>
<parameter key="horizon" value="2"/>
</operator>
<operator activated="true" class="series:predict_series" compatibility="5.3.000" expanded="true" height="60" name="Predict: 22 5 22" width="90" x="45" y="120">
<parameter key="window_width" value="15"/>
<parameter key="horizon" value="2"/>
<parameter key="max_training_set_size" value="15"/>
<process expanded="true">
<operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance Vector Machine" width="90" x="45" y="30"/>
<connect from_port="window example set" to_op="Relevance Vector Machine" to_port="training set"/>
<connect from_op="Relevance Vector Machine" from_port="model" to_port="prediction model"/>
<portSpacing port="source_window example set" spacing="0"/>
<portSpacing port="sink_prediction model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename" width="90" x="180" y="120">
<parameter key="old_name" value="prediction(label)"/>
<parameter key="new_name" value="pred"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Attributes" width="90" x="315" y="120">
<list key="function_descriptions">
<parameter key="pred_times_label" value="pred*label"/>
<parameter key="pred_times_label_greater_0" value="if(pred*label>=0, 1, 0)"/>
<parameter key="abs_pred_minus_label" value="abs(pred-label)"/>
</list>
</operator>
<operator activated="true" class="extract_performance" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="469" y="119">
<parameter key="performance_type" value="statistics"/>
<parameter key="attribute_name" value="abs_pred_minus_label"/>
</operator>
<connect from_op="Gen TS" from_port="output" to_op="Create Sum" to_port="example set input"/>
<connect from_op="Create Sum" from_port="example set output" to_op="Guess Types" to_port="example set input"/>
<connect from_op="Guess Types" from_port="example set output" to_op="Select Sum" to_port="example set input"/>
<connect from_op="Select Sum" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Win 3 2" to_port="example set input"/>
<connect from_op="Win 3 2" from_port="example set output" to_op="Predict: 22 5 22" to_port="example set"/>
<connect from_op="Predict: 22 5 22" from_port="example set" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Performance" to_port="example set"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
<connect from_op="Performance" from_port="example set" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
(I have problems uploading images and will edit this post later. For now, just go into the results dataset and plot "pred" and "label", and maybe "abs_pred_minus_label".)
Try to figure out why absolute error is different from average(abs_pred_minus_label).
Also note that I'm not using a fixed split; instead I'm using a sliding window validation, because this is the proper way to validate time series models.
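To make the idea of sliding window validation concrete, here is a small sketch in plain Python. It is not what the RapidMiner operator does internally: the function name sliding_window_abs_errors is made up, the "model" is just the mean of the training window as a stand-in for a real learner, and the handling of the horizon (target is the value `horizon` steps after the training window) is an assumed semantic.

```python
from statistics import mean


def sliding_window_abs_errors(values, train_width, horizon, step=1):
    """Slide a training window over the series and collect one absolute error per position."""
    errors = []
    start = 0
    while start + train_width + horizon <= len(values):
        train = values[start:start + train_width]
        target = values[start + train_width + horizon - 1]
        prediction = mean(train)  # stand-in model, replace with a real learner
        errors.append(abs(prediction - target))
        start += step
    return errors


values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # toy series
errors = sliding_window_abs_errors(values, train_width=3, horizon=2)
print(errors)
print(mean(errors))
```

Every test value is predicted only from data that lies before it, and the window moves forward through the whole series instead of cutting it once at a fixed date.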
This XML shows how you can use the Regression Performance Operator.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="subprocess" compatibility="5.3.008" expanded="true" height="76" name="Generate Data (6)" width="90" x="45" y="30">
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="5.3.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
<parameter key="target_function" value="driller oscillation timeseries"/>
<parameter key="number_examples" value="200"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Sum" width="90" x="180" y="30">
<list key="function_descriptions">
<parameter key="sum" value="str(11*att1+22*att2+33*att3+44*att4+att5)"/>
</list>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.3.008" expanded="true" height="76" name="Select Sum" width="90" x="319" y="29">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sum"/>
</operator>
<operator activated="true" class="parse_numbers" compatibility="5.3.008" expanded="true" height="76" name="Parse Numbers (2)" width="90" x="441" y="26">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sum"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.3.008" expanded="true" height="94" name="Normalize" width="90" x="561" y="27">
<parameter key="method" value="range transformation"/>
</operator>
<operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename Label" width="90" x="699" y="28">
<parameter key="old_name" value="sum"/>
<parameter key="new_name" value="label"/>
<list key="rename_additional_attributes"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Generate Sum" to_port="example set input"/>
<connect from_op="Generate Sum" from_port="example set output" to_op="Select Sum" to_port="example set input"/>
<connect from_op="Select Sum" from_port="example set output" to_op="Parse Numbers (2)" to_port="example set input"/>
<connect from_op="Parse Numbers (2)" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Rename Label" to_port="example set input"/>
<connect from_op="Rename Label" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="76" name="Win 3 2" width="90" x="187" y="32">
<parameter key="window_size" value="3"/>
<parameter key="create_label" value="true"/>
<parameter key="label_attribute" value="label"/>
<parameter key="horizon" value="2"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.3.008" expanded="true" height="94" name="Multiply" width="90" x="309" y="34"/>
<operator activated="true" class="series:sliding_window_validation" compatibility="5.3.000" expanded="true" height="112" name="Validation" width="90" x="515" y="30">
<parameter key="training_window_width" value="15"/>
<parameter key="test_window_width" value="1"/>
<parameter key="horizon" value="2"/>
<parameter key="average_performances_only" value="false"/>
<process expanded="true">
<operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance VM (2)" width="90" x="152" y="50"/>
<connect from_port="training" to_op="Relevance VM (2)" to_port="training set"/>
<connect from_op="Relevance VM (2)" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="5.3.008" expanded="true" height="76" name="Apply Model" width="90" x="91" y="12">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_regression" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="282" y="61">
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="absolute_error" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="series:predict_series" compatibility="5.3.000" expanded="true" height="60" name="Predict: 22 5 22" width="90" x="78" y="331">
<parameter key="window_width" value="15"/>
<parameter key="horizon" value="2"/>
<parameter key="max_training_set_size" value="15"/>
<process expanded="true">
<operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance VM" width="90" x="412" y="29"/>
<connect from_port="window example set" to_op="Relevance VM" to_port="training set"/>
<connect from_op="Relevance VM" from_port="model" to_port="prediction model"/>
<portSpacing port="source_window example set" spacing="0"/>
<portSpacing port="sink_prediction model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename" width="90" x="263" y="330">
<parameter key="old_name" value="prediction(label)"/>
<parameter key="new_name" value="pred"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Attributes" width="90" x="439" y="335">
<list key="function_descriptions">
<parameter key="abs_pred_minus_label" value="abs(pred-label)"/>
</list>
</operator>
<operator activated="true" class="extract_performance" compatibility="5.3.008" expanded="true" height="76" name="Performance (2)" width="90" x="657" y="349">
<parameter key="performance_type" value="statistics"/>
<parameter key="attribute_name" value="abs_pred_minus_label"/>
</operator>
<connect from_op="Generate Data (6)" from_port="out 1" to_op="Win 3 2" to_port="example set input"/>
<connect from_op="Win 3 2" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
<connect from_op="Multiply" from_port="output 2" to_op="Predict: 22 5 22" to_port="example set"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
<connect from_op="Predict: 22 5 22" from_port="example set" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Performance (2)" to_port="example set"/>
<connect from_op="Performance (2)" from_port="performance" to_port="result 2"/>
<connect from_op="Performance (2)" from_port="example set" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
Thank you so much for your answer. Because I'm a beginner, I don't know how to import your XML as a new operator into my process from videos 8 to 10, and I'm not sure at which position in the chain to place this operator.
Best regards, Dai Wizard!
Create new perspective.
In show view, tick XML, untick all others.
In XML tab:
Paste XML code
Click green V symbol.
Return to your standard view.
Thank you, wessel, for your tips, but I'm afraid it looks too complicated for me; I don't think I can fully understand it. Therefore I've created a PDF file that you can view using this link: http://www.professor-heusenstamm.com/model.pdf
Bild 1 shows my original process; Bild 2 is the content of the validation operator.
Bild 3 shows the general performance output.
Bild 4 is my latest progress :-) I've inserted the "Log" operator and defined the values for performance and prediction accuracy there.
Bild 5 shows the result of the latter.
My question is: did I insert the Log operator at the correct position in the process (Bild 4) to be sure it delivers the performance of the predicted n+1 value, which is the content of "Read Excel (2)", or do I have to rearrange or add something?
As usual, I'm looking forward to anybody's comments.
http://i.snag.gy/STABy.jpg
I used this button to create a new perspective (I named this perspective XML):
http://i.snag.gy/A53kc.jpg
So now my screen looks like:
http://i.snag.gy/6QXgV.jpg
This is easy for sharing processes.