How can future predictions be made with a Time Series model in RapidMiner?

luc_bartkowski Member Posts: 46 Maven
edited December 2018 in Help

I guess this topic is the most asked question regarding RapidMiner Time Series Prediction. Some examples:

We all ask the same question.

We want to be able to make predictions for tomorrow, next week(s), next month(s), whatever the horizon and the unit of time may be.

Some have even asked the same question multiple times in their topic/post, as if the question were not clear.

The following picture therefore illustrates the question.

 

rmcomq.jpeg

 

How to:

  • Calculate the prediction on Oct 5 (black markup);
  • Using the "-0 attributes" from the Windowing operator (blue markup);
  • In order to predict (orange arrow) the unknown future Last value on Oct 5 (red markup);
  • In the same way the "-0 attributes" (brown markup) are used to calculate the predictions (yellow markup) in the train/validation/test example set;
  • But without being able to use the unknown future Last value (red markup) as a label (green markup)?
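To make the question concrete outside RapidMiner, here is a minimal Python sketch of the situation in the picture. It is only an illustration, not a RapidMiner process: the `make_windows`, `fit_drift` and `predict` helpers, the sample values and the very simple drift model are all assumptions made for this example. The point is that windowing yields labelled rows for training plus one final window that has attributes but no label, and the trained model is simply applied to that last window.

```python
from statistics import mean


def make_windows(series, width):
    """Split a series into labelled training rows and one unlabelled future row.

    Each labelled row is (window of `width` past values, next value as label).
    The future row is the most recent window, for which no label exists yet.
    """
    labelled = [(series[i:i + width], series[i + width])
                for i in range(len(series) - width)]
    future = series[-width:]
    return labelled, future


def fit_drift(labelled):
    """Hypothetical stand-in model: average step from last window value to label."""
    return mean(label - window[-1] for window, label in labelled)


def predict(window, drift):
    """Predict the next value as the last known value plus the learned drift."""
    return window[-1] + drift


# Made-up "Last" values; the value for the next day is unknown.
last_values = [50.1, 50.4, 50.9, 51.1, 51.6, 52.0, 52.3]

labelled, future_window = make_windows(last_values, width=3)
drift = fit_drift(labelled)                 # trained on rows that have a label
forecast = predict(future_window, drift)    # applied to the row without a label

print(f"future window: {future_window}, forecast: {forecast:.2f}")
```

Any learner could replace the drift model here; what matters is that the last window is scored with the model but never needs a label.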

The only answer with a possible solution is from @Thomas_Ott: http://community.rapidminer.com/t5/Getting-Started-Forum/Time-Series-Forecasting-for-Data/m-p/37315 . His answer links to an XML RapidMiner process in http://community.rapidminer.com/t5/RapidMiner-Studio-Forum/Recall-Error/m-p/37302#U37302 . That XML implements a complex process that includes macro manipulation, multiple Windowing operators in series, Remember/Recall and Loop operators, and even a "Materialize Data" operator to free up memory in RapidMiner. The process is also based on the Yahoo Historical Data operator, which unfortunately doesn't work anymore. I'm therefore not even sure whether this process answers the question of this topic. Is there a simpler process/solution available to answer the question of this topic?

 

Thanks,

Luc

 

Best Answer

  • luc_bartkowski Member Posts: 46 Maven
    Solution Accepted

    Happy to do so Martin.

     

    To be honest: don't know much yet about ARIMA. Will watch some YouTube regarding ARIMA this weekend.

    But luckily RapidMiner offers an Optimize Parameters operator.
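    For readers unfamiliar with what a parameter optimization does, the idea can be sketched in plain Python as follows. This is not the RapidMiner process from this post: it swaps the evolutionary search for a plain grid search and swaps ARIMA for simple exponential smoothing, so that the example stays dependency-free. All function names and the sample series are hypothetical.

```python
def ses_forecast(history, alpha):
    """One-step-ahead forecast by simple exponential smoothing."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def holdout_error(series, n_test, alpha):
    """Mean absolute one-step-ahead error over the last n_test points.

    Each test point is forecast using only the values that precede it.
    """
    errors = []
    for t in range(len(series) - n_test, len(series)):
        errors.append(abs(series[t] - ses_forecast(series[:t], alpha)))
    return sum(errors) / len(errors)


def grid_search(series, n_test, candidates):
    """Score every candidate parameter value and return the best one."""
    scores = {alpha: holdout_error(series, n_test, alpha) for alpha in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up series; the last 4 points serve as the hold-out set.
series = [50.1, 50.4, 50.9, 51.1, 51.6, 52.0, 52.3, 52.1, 52.6, 53.0]
best_alpha, scores = grid_search(series, n_test=4, candidates=[0.2, 0.5, 0.8])

for alpha, score in scores.items():
    print(f"alpha={alpha}: mean abs error={score:.3f}")
print("best alpha:", best_alpha)
```

    An evolutionary optimizer follows the same loop of proposing parameter values, scoring them and keeping the best, but it generates new candidates from the best ones found so far instead of walking a fixed grid.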

     

    So @tftemme this is the result:

    Oil Prediction.jpg

    And the model:

    OilPredictionModelARIMA.jpeg

     

    And the XML

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Optimization Cycles" width="90" x="782" y="289">
    <parameter key="macro" value="OptimizeCycles"/>
    <parameter key="value" value="50"/>
    </operator>
    <operator activated="true" class="generate_macro" compatibility="7.6.001" expanded="true" height="68" name="Current Date" width="90" x="782" y="85">
    <list key="function_descriptions">
    <parameter key="CurrentDate" value="date_now()"/>
    </list>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Prediction Horizon" width="90" x="916" y="85">
    <parameter key="macro" value="PredictionHorizon"/>
    <parameter key="value" value="20"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Training From Date" width="90" x="782" y="187">
    <parameter key="macro" value="AnalysesDateFrom"/>
    <parameter key="value" value="2016/02/11"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Training To Date" width="90" x="916" y="187">
    <parameter key="macro" value="TrainingDateTo"/>
    <parameter key="value" value="%{CurrentDate}"/>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Get/Join Data" width="90" x="112" y="85">
    <process expanded="true">
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Oil Futures" width="90" x="313" y="34">
    <process expanded="true">
    <operator activated="true" class="jdbc_connectors:read_database" compatibility="7.6.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
    <parameter key="define_connection" value="predefined"/>
    <parameter key="connection" value="MySQL"/>
    <parameter key="database_system" value="MySQL"/>
    <parameter key="define_query" value="query"/>
    <parameter key="query" value="SELECT *&#10;FROM `oil`&#10;ORDER BY Date desc&#10;limit 9999"/>
    <parameter key="use_default_schema" value="true"/>
    <parameter key="prepare_statement" value="false"/>
    <enumeration key="parameters"/>
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    </operator>
    <operator activated="false" class="store" compatibility="7.6.001" expanded="true" height="68" name="Store (11)" width="90" x="45" y="136">
    <parameter key="repository_entry" value="//Cloud Repository/Samples/data/oilfuturesvw"/>
    </operator>
    <operator activated="false" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve (2)" width="90" x="179" y="136">
    <parameter key="repository_entry" value="../data/oilfuturesvw"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="514" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value="Volume|Settle|Previous Day Open Interest|Open|Low|Last|High|Date"/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="true" class="nominal_to_date" compatibility="7.6.001" expanded="true" height="82" name="Nominal to Date (8)" width="90" x="648" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="date_type" value="date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <parameter key="keep_old_attribute" value="false"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename (8)" width="90" x="782" y="34">
    <parameter key="old_name" value="Date"/>
    <parameter key="new_name" value="oilDate"/>
    <list key="rename_additional_attributes">
    <parameter key="High" value="oilHigh"/>
    <parameter key="Low" value="oilLow"/>
    <parameter key="Open" value="oilOpen"/>
    <parameter key="Previous Day Open Interest" value="oilPrevDayOpenInt"/>
    <parameter key="Settle" value="oilSettle"/>
    <parameter key="Volume" value="oilVolume"/>
    <parameter key="Last" value="oilLast"/>
    </list>
    </operator>
    <connect from_op="Read Database (2)" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Nominal to Date (8)" to_port="example set input"/>
    <connect from_op="Nominal to Date (8)" from_port="example set output" to_op="Rename (8)" to_port="example set input"/>
    <connect from_op="Rename (8)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Oil Futures" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="sort" compatibility="7.6.001" expanded="true" height="82" name="Sort" width="90" x="246" y="85">
    <parameter key="attribute_name" value="oilDate"/>
    <parameter key="sorting_direction" value="increasing"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role" width="90" x="112" y="238">
    <parameter key="attribute_name" value="oilLast"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    <parameter key="oilLast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="246" y="238">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value="oilLast|oilHigh|oilLow|oilOpen|oilSettle|oilPrevDayOpenInt|oilVolume"/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Start of Trend" width="90" x="380" y="238">
    <parameter key="parameter_expression" value="date_after(oilDate, date_parse_custom(%{AnalysesDateFrom}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Train until Hold-off" width="90" x="514" y="238">
    <parameter key="parameter_expression" value="date_before(oilDate, date_now())"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="124" name="Multiply (3)" width="90" x="112" y="544"/>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="103" name="ARIMA Predict Last" width="90" x="246" y="442">
    <process expanded="true">
    <operator activated="true" class="optimize_parameters_evolutionary" compatibility="7.6.001" expanded="true" height="145" name="Optimize Parameters (Evolutionary)" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="ARIMA Trainer.q:_order_of_the_moving-average_model" value="[0.0;100.0]"/>
    <parameter key="ARIMA Trainer.p:_order_of_the_autoregressive_model" value="[0.0;100.0]"/>
    </list>
    <parameter key="error_handling" value="ignore error"/>
    <parameter key="max_generations" value="%{OptimizeCycles}"/>
    <parameter key="use_early_stopping" value="true"/>
    <parameter key="generations_without_improval" value="2"/>
    <parameter key="specify_population_size" value="true"/>
    <parameter key="population_size" value="5"/>
    <parameter key="keep_best" value="true"/>
    <parameter key="mutation_type" value="gaussian_mutation"/>
    <parameter key="selection_type" value="tournament"/>
    <parameter key="tournament_fraction" value="0.25"/>
    <parameter key="crossover_prob" value="0.9"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    <parameter key="show_convergence_plot" value="false"/>
    <process expanded="true">
    <operator activated="true" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer" width="90" x="246" y="34">
    <parameter key="time_series_attribute" value="oilLast"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="39"/>
    <parameter key="d:_degree_of_differencing" value="0"/>
    <parameter key="q:_order_of_the_moving-average_model" value="94"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="true" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast" width="90" x="380" y="34">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <parameter key="forecast_only" value="false"/>
    <parameter key="add_combined_output" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA model to forecast the next values of the time series; the horizon is set by the PredictionHorizon macro</description>
    </operator>
    <connect from_port="input 1" to_op="ARIMA Trainer" to_port="example set"/>
    <connect from_op="ARIMA Trainer" from_port="forecast model" to_op="Apply Forecast" to_port="forecast model"/>
    <connect from_op="ARIMA Trainer" from_port="performance" to_port="performance"/>
    <connect from_op="Apply Forecast" from_port="example set" to_port="result 1"/>
    <connect from_op="Apply Forecast" from_port="original" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (6)" width="90" x="313" y="187">
    <parameter key="time_series_attribute" value="oilLast"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="2"/>
    <parameter key="d:_degree_of_differencing" value="0"/>
    <parameter key="q:_order_of_the_moving-average_model" value="92"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="false" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (6)" width="90" x="447" y="187">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <parameter key="forecast_only" value="false"/>
    <parameter key="add_combined_output" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA model to forecast the next values of the time series; the horizon is set by the PredictionHorizon macro</description>
    </operator>
    <connect from_port="in 1" to_op="Optimize Parameters (Evolutionary)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (Evolutionary)" from_port="performance" to_port="out 1"/>
    <connect from_op="Optimize Parameters (Evolutionary)" from_port="result 1" to_port="out 2"/>
    <connect from_op="ARIMA Trainer (6)" from_port="forecast model" to_op="Apply Forecast (6)" to_port="forecast model"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    <portSpacing port="sink_out 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role (3)" width="90" x="447" y="442">
    <parameter key="attribute_name" value="forecast of oilLast"/>
    <parameter key="target_role" value="regular"/>
    <list key="set_additional_roles">
    <parameter key="oilLast and forecast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="103" name="ARIMA Predict High" width="90" x="246" y="595">
    <process expanded="true">
    <operator activated="true" class="optimize_parameters_evolutionary" compatibility="7.6.001" expanded="true" height="145" name="Optimize Parameters (2)" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="ARIMA Trainer (4).q:_order_of_the_moving-average_model" value="[0.0;100.0]"/>
    <parameter key="ARIMA Trainer (4).p:_order_of_the_autoregressive_model" value="[0.0;100.0]"/>
    </list>
    <parameter key="error_handling" value="ignore error"/>
    <parameter key="max_generations" value="%{OptimizeCycles}"/>
    <parameter key="use_early_stopping" value="true"/>
    <parameter key="generations_without_improval" value="2"/>
    <parameter key="specify_population_size" value="true"/>
    <parameter key="population_size" value="5"/>
    <parameter key="keep_best" value="true"/>
    <parameter key="mutation_type" value="gaussian_mutation"/>
    <parameter key="selection_type" value="tournament"/>
    <parameter key="tournament_fraction" value="0.25"/>
    <parameter key="crossover_prob" value="0.9"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    <parameter key="show_convergence_plot" value="false"/>
    <process expanded="true">
    <operator activated="true" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (4)" width="90" x="246" y="34">
    <parameter key="time_series_attribute" value="oilHigh"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="d:_degree_of_differencing" value="0"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="true" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (4)" width="90" x="380" y="34">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <parameter key="forecast_only" value="false"/>
    <parameter key="add_combined_output" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA model to forecast the next values of the time series; the horizon is set by the PredictionHorizon macro</description>
    </operator>
    <connect from_port="input 1" to_op="ARIMA Trainer (4)" to_port="example set"/>
    <connect from_op="ARIMA Trainer (4)" from_port="forecast model" to_op="Apply Forecast (4)" to_port="forecast model"/>
    <connect from_op="ARIMA Trainer (4)" from_port="performance" to_port="performance"/>
    <connect from_op="Apply Forecast (4)" from_port="example set" to_port="result 1"/>
    <connect from_op="Apply Forecast (4)" from_port="original" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (7)" width="90" x="112" y="238">
    <parameter key="time_series_attribute" value="oilHigh"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="d:_degree_of_differencing" value="0"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="false" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (7)" width="90" x="246" y="238">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <parameter key="forecast_only" value="false"/>
    <parameter key="add_combined_output" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA model to forecast the next values of the time series; the horizon is set by the PredictionHorizon macro</description>
    </operator>
    <connect from_port="in 1" to_op="Optimize Parameters (2)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (2)" from_port="performance" to_port="out 1"/>
    <connect from_op="Optimize Parameters (2)" from_port="result 1" to_port="out 2"/>
    <connect from_op="ARIMA Trainer (7)" from_port="forecast model" to_op="Apply Forecast (7)" to_port="forecast model"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    <portSpacing port="sink_out 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role (4)" width="90" x="447" y="595">
    <parameter key="attribute_name" value="forecast of oilHigh"/>
    <parameter key="target_role" value="regular"/>
    <list key="set_additional_roles">
    <parameter key="oilHigh and forecast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="103" name="ARIMA Predict Low" width="90" x="246" y="748">
    <process expanded="true">
    <operator activated="true" class="optimize_parameters_evolutionary" compatibility="7.6.001" expanded="true" height="124" name="Optimize Parameters (3)" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="ARIMA Trainer (5).q:_order_of_the_moving-average_model" value="[0.0;100.0]"/>
    <parameter key="ARIMA Trainer (5).p:_order_of_the_autoregressive_model" value="[0.0;100.0]"/>
    </list>
    <parameter key="error_handling" value="ignore error"/>
    <parameter key="max_generations" value="%{OptimizeCycles}"/>
    <parameter key="use_early_stopping" value="true"/>
    <parameter key="generations_without_improval" value="2"/>
    <parameter key="specify_population_size" value="true"/>
    <parameter key="population_size" value="5"/>
    <parameter key="keep_best" value="true"/>
    <parameter key="mutation_type" value="gaussian_mutation"/>
    <parameter key="selection_type" value="tournament"/>
    <parameter key="tournament_fraction" value="0.25"/>
    <parameter key="crossover_prob" value="0.9"/>
    <parameter key="use_local_random_seed" value="false"/>
    <parameter key="local_random_seed" value="1992"/>
    <parameter key="show_convergence_plot" value="false"/>
    <process expanded="true">
    <operator activated="true" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (5)" width="90" x="112" y="85">
    <parameter key="time_series_attribute" value="oilLow"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="d:_degree_of_differencing" value="0"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="true" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (5)" width="90" x="380" y="238">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <parameter key="forecast_only" value="false"/>
    <parameter key="add_combined_output" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA model to forecast the next values of the time series; the horizon is set by the PredictionHorizon macro</description>
    </operator>
    <connect from_port="input 1" to_op="ARIMA Trainer (5)" to_port="example set"/>
    <connect from_op="ARIMA Trainer (5)" from_port="forecast model" to_op="Apply Forecast (5)" to_port="forecast model"/>
    <connect from_op="ARIMA Trainer (5)" from_port="performance" to_port="performance"/>
    <connect from_op="Apply Forecast (5)" from_port="example set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (2)" width="90" x="112" y="238">
    <parameter key="time_series_attribute" value="oilLow"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="d:_degree_of_differencing" value="0"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="false" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (2)" width="90" x="246" y="238">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <parameter key="forecast_only" value="false"/>
    <parameter key="add_combined_output" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA model to forecast the next values of the time series; the horizon is set by the PredictionHorizon macro</description>
    </operator>
    <connect from_port="in 1" to_op="Optimize Parameters (3)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (3)" from_port="performance" to_port="out 1"/>
    <connect from_op="Optimize Parameters (3)" from_port="result 1" to_port="out 2"/>
    <connect from_op="ARIMA Trainer (2)" from_port="forecast model" to_op="Apply Forecast (2)" to_port="forecast model"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    <portSpacing port="sink_out 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role (5)" width="90" x="447" y="748">
    <parameter key="attribute_name" value="forecast of oilLow"/>
    <parameter key="target_role" value="regular"/>
    <list key="set_additional_roles">
    <parameter key="oilLow and forecast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Graph Last" width="90" x="581" y="442">
    <parameter key="parameter_expression" value="date_after(oilDate, date_set(date_now(), -eval(%{PredictionHorizon})-1, DATE_UNIT_DAY))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Graph High" width="90" x="581" y="595">
    <parameter key="parameter_expression" value="date_after(oilDate, date_set(date_now(), -eval(%{PredictionHorizon})-1, DATE_UNIT_DAY))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Graph Low" width="90" x="581" y="748">
    <parameter key="parameter_expression" value="date_after(oilDate, date_set(date_now(), -eval(%{PredictionHorizon})-1, DATE_UNIT_DAY))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="join" compatibility="7.6.001" expanded="true" height="82" name="Join" width="90" x="715" y="493">
    <parameter key="remove_double_attributes" value="true"/>
    <parameter key="join_type" value="inner"/>
    <parameter key="use_id_attribute_as_key" value="false"/>
    <list key="key_attributes">
    <parameter key="oilDate" value="oilDate"/>
    </list>
    <parameter key="keep_both_join_attributes" value="false"/>
    </operator>
    <operator activated="true" class="join" compatibility="7.6.001" expanded="true" height="82" name="Join (2)" width="90" x="715" y="595">
    <parameter key="remove_double_attributes" value="true"/>
    <parameter key="join_type" value="inner"/>
    <parameter key="use_id_attribute_as_key" value="false"/>
    <list key="key_attributes">
    <parameter key="oilDate" value="oilDate"/>
    </list>
    <parameter key="keep_both_join_attributes" value="false"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Oil Forecast" width="90" x="849" y="493">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value="oilLow|oilLast|oilHigh|oilDate|forecast of oilLow|forecast of oilLast|forecast of oilHigh"/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="false" class="reporting:generate_report" compatibility="5.3.000" expanded="true" height="82" name="Generate Report" width="90" x="849" y="595">
    <parameter key="report_name" value="Oil Prediction"/>
    <parameter key="format" value="HTML"/>
    <parameter key="report_to_repository" value="false"/>
    <parameter key="html_output_directory" value="/Users/Luc/Dropbox/RapidMiner Prediction Reports"/>
    <parameter key="pdf_output_file" value="/Users/Luc/Dropbox/OilPrediction.pdf"/>
    <parameter key="html_logo_file" value="/Users/Luc/Dropbox/RapidMiner Prediction Reports/logo.png"/>
    <parameter key="html_image_format" value="png"/>
    <parameter key="image_col_span" value="8"/>
    <parameter key="image_row_span" value="17"/>
    <parameter key="page_size" value="0"/>
    <parameter key="page_format" value="0"/>
    <parameter key="template_type" value="0"/>
    <parameter key="pdf_template_file" value="/no file selected"/>
    <parameter key="image_template_file" value="/no file selected"/>
    <parameter key="image_alignment" value="0"/>
    <parameter key="set_background_color" value="true"/>
    <parameter key="background_color" value="255,255,255"/>
    <parameter key="page_width" value="595"/>
    <parameter key="page_height" value="842"/>
    <parameter key="top_page_margin" value="36"/>
    <parameter key="bottom_page_margin" value="36"/>
    <parameter key="left_page_margin" value="36"/>
    <parameter key="right_page_margin" value="36"/>
    <parameter key="section_one_font" value="courier"/>
    <parameter key="section_one_font_size" value="12.0"/>
    <parameter key="section_one_font_style_bold" value="false"/>
    <parameter key="section_one_font_style_italic" value="false"/>
    <parameter key="section_one_font_style_underline" value="false"/>
    <parameter key="section_one_font_style_strikethrough" value="false"/>
    <parameter key="section_one_font_color" value="0,0,0"/>
    <parameter key="section_two_font" value="courier"/>
    <parameter key="section_two_font_size" value="12.0"/>
    <parameter key="section_two_font_style_bold" value="false"/>
    <parameter key="section_two_font_style_italic" value="false"/>
    <parameter key="section_two_font_style_underline" value="false"/>
    <parameter key="section_two_font_style_strikethrough" value="false"/>
    <parameter key="section_two_font_color" value="0,0,0"/>
    <parameter key="section_three_font" value="courier"/>
    <parameter key="section_three_font_size" value="12.0"/>
    <parameter key="section_three_font_style_bold" value="false"/>
    <parameter key="section_three_font_style_italic" value="false"/>
    <parameter key="section_three_font_style_underline" value="false"/>
    <parameter key="section_three_font_style_strikethrough" value="false"/>
    <parameter key="section_three_font_color" value="0,0,0"/>
    <parameter key="section_four_font" value="courier"/>
    <parameter key="section_four_font_size" value="12.0"/>
    <parameter key="section_four_font_style_bold" value="false"/>
    <parameter key="section_four_font_style_italic" value="false"/>
    <parameter key="section_four_font_style_underline" value="false"/>
    <parameter key="section_four_font_style_strikethrough" value="false"/>
    <parameter key="section_four_font_color" value="0,0,0"/>
    <parameter key="section_five_font" value="courier"/>
    <parameter key="section_five_font_size" value="12.0"/>
    <parameter key="section_five_font_style_bold" value="false"/>
    <parameter key="section_five_font_style_italic" value="false"/>
    <parameter key="section_five_font_style_underline" value="false"/>
    <parameter key="section_five_font_style_strikethrough" value="false"/>
    <parameter key="section_five_font_color" value="0,0,0"/>
    <parameter key="text_content_font" value="courier"/>
    <parameter key="text_content_font_size" value="12.0"/>
    <parameter key="text_content_font_style_bold" value="false"/>
    <parameter key="text_content_font_style_italic" value="false"/>
    <parameter key="text_content_font_style_underline" value="false"/>
    <parameter key="text_content_font_style_strikethrough" value="false"/>
    <parameter key="text_content_font_color" value="0,0,0"/>
    <parameter key="system_fonts" value="false"/>
    <parameter key="directory_fonts" value="false"/>
    <parameter key="table_column_number" value="16"/>
    <parameter key="table_header_color" value="128,128,128"/>
    <parameter key="table_row_color_one" value="255,255,255"/>
    <parameter key="table_row_color_two" value="192,192,192"/>
    </operator>
    <operator activated="false" class="reporting:report" compatibility="5.3.000" expanded="true" height="68" name="Report" width="90" x="849" y="697">
    <parameter key="report_name" value="Oil Prediction"/>
    <parameter key="report_item_header" value="%{CurrentDate}"/>
    <parameter key="specified" value="true"/>
    <parameter key="reportable_type" value="Data Table"/>
    <parameter key="renderer_name" value="Advanced Charts"/>
    <list key="parameters"/>
    <parameter key="image_width" value="800"/>
    <parameter key="image_height" value="600"/>
    </operator>
    <connect from_op="Get/Join Data" from_port="out 1" to_op="Sort" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Filter Start of Trend" to_port="example set input"/>
    <connect from_op="Filter Start of Trend" from_port="example set output" to_op="Train until Hold-off" to_port="example set input"/>
    <connect from_op="Train until Hold-off" from_port="example set output" to_op="Multiply (3)" to_port="input"/>
    <connect from_op="Multiply (3)" from_port="output 1" to_op="ARIMA Predict Last" to_port="in 1"/>
    <connect from_op="Multiply (3)" from_port="output 2" to_op="ARIMA Predict High" to_port="in 1"/>
    <connect from_op="Multiply (3)" from_port="output 3" to_op="ARIMA Predict Low" to_port="in 1"/>
    <connect from_op="ARIMA Predict Last" from_port="out 1" to_port="result 3"/>
    <connect from_op="ARIMA Predict Last" from_port="out 2" to_op="Set Role (3)" to_port="example set input"/>
    <connect from_op="Set Role (3)" from_port="example set output" to_op="Filter Graph Last" to_port="example set input"/>
    <connect from_op="ARIMA Predict High" from_port="out 1" to_port="result 1"/>
    <connect from_op="ARIMA Predict High" from_port="out 2" to_op="Set Role (4)" to_port="example set input"/>
    <connect from_op="Set Role (4)" from_port="example set output" to_op="Filter Graph High" to_port="example set input"/>
    <connect from_op="ARIMA Predict Low" from_port="out 1" to_port="result 2"/>
    <connect from_op="ARIMA Predict Low" from_port="out 2" to_op="Set Role (5)" to_port="example set input"/>
    <connect from_op="Set Role (5)" from_port="example set output" to_op="Filter Graph Low" to_port="example set input"/>
    <connect from_op="Filter Graph Last" from_port="example set output" to_op="Join" to_port="left"/>
    <connect from_op="Filter Graph High" from_port="example set output" to_op="Join" to_port="right"/>
    <connect from_op="Filter Graph Low" from_port="example set output" to_op="Join (2)" to_port="right"/>
    <connect from_op="Join" from_port="join" to_op="Join (2)" to_port="left"/>
    <connect from_op="Join (2)" from_port="join" to_op="Oil Forecast" to_port="example set input"/>
    <connect from_op="Oil Forecast" from_port="example set output" to_port="result 4"/>
    <connect from_op="Generate Report" from_port="through 1" to_op="Report" to_port="reportable in"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    <portSpacing port="sink_result 5" spacing="0"/>
    <description align="center" color="yellow" colored="true" height="352" resized="true" width="334" x="701" y="22">Process Configuration (training example set, horizon, cycles ARIMA optimization, prediction date)</description>
    <description align="center" color="green" colored="true" height="166" resized="true" width="558" x="83" y="212">Select Time Series Scope</description>
    <description align="center" color="gray" colored="true" height="141" resized="true" width="551" x="84" y="49">Get source data</description>
    <description align="center" color="orange" colored="true" height="481" resized="true" width="278" x="81" y="395">Generate Future Predictions</description>
    <description align="center" color="blue" colored="true" height="481" resized="true" width="611" x="422" y="397">Reporting</description>
    </process>
    </operator>
    </process>

    Really love RapidMiner.

    Have a nice weekend.

    Greetings,

    Luc

     

Answers

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello @luc_bartkowski - thanks for this. I agree that this is a very frequent use case and also agree that it could be easier. A quick spoiler: the Time Series Extension is undergoing a complete rebuild (see the blog post from 2 weeks ago by @tftemme). That said, I think we can help consolidate these threads here and maybe turn this into a sample for the new extension? :) If so, could you please post (repost?) that data set, and we will work on this together.

     

    As for the Yahoo Historical Data issue, yes, we have talked about this a lot in this forum. Numerous people have posted alternative solutions (see my KB article about Alpha Vantage or the posts about using Quandl). Meanwhile, we are working on pushing out a more permanent, better solution.


    Scott

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Personally @sgenzer I am very much looking forward to the rebuilding of the time series extension and the addition of new operators to make things easier, or to fill in gaps in the current offering (R package "forecast", anyone?).

     

    But in the meantime, @luc_bartkowski, you may find that there is another sample process, which is heavily annotated, that might help you along your way. If you install the series extension, then when you open the "File>New Process" window of RapidMiner, you will be prompted with a series forecasting template, shown here (just scroll down until you see it). I think you will find it helpful.

    time series sample.PNG

     

     

     

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • luc_bartkowski Member Posts: 46 Maven

    Dear @Telcontar120,

    Thank you for your answer, but the "Time Series Forecasting" template doesn't predict beyond the dates of the example set either.

    Greetings,

    Luc

  • luc_bartkowski Member Posts: 46 Maven

    Hello @sgenzer / Scott,

     

    I've managed to reverse engineer the "loop" solution of @Thomas_Ott and built it into my own Time Series Prediction process.

    I am "close", but still "no cigar". See the following pictures and the attached XML. The first picture shows my "standard" Time Series Forecasting train/validate/test process. The second picture zooms in on the Loop subprocess.

     

    These processes are based on the Quandl CME_CL1 Crude Oil Futures Continuous Contract 1 CL1 Front Month dataset.

    Please note that I added "oil" in front of every attribute name. So attribute Open of this dataset has been renamed oilOpen.

    The same for all other attributes: oilDate, oilHigh, oilLow, oilLast, etc. 
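
    For anyone who wants to reproduce the renaming outside RapidMiner, here is a minimal Python sketch of the same mapping. The name pairs are copied from the Rename operator in the attached XML; the function name rename_attributes and the example values are only illustrative assumptions.

```python
# Old name -> new name, as configured in the Rename operator of the attached process.
RENAME_MAP = {
    "Date": "oilDate",
    "High": "oilHigh",
    "Low": "oilLow",
    "Open": "oilOpen",
    "Previous Day Open Interest": "oilPrevDayOpenInt",
    "Settle": "oilSettle",
    "Volume": "oilVolume",
    "Last": "oilLast",
}


def rename_attributes(row):
    """Return a copy of one data row with the source names replaced.

    Names that are not in RENAME_MAP are passed through unchanged.
    (Hypothetical helper, not part of RapidMiner.)
    """
    return {RENAME_MAP.get(name, name): value for name, value in row.items()}


# Made-up values, only to show the shape of the result.
example_row = {"Date": "2017-09-08", "Open": 49.1, "Last": 47.5}
print(rename_attributes(example_row))
```

    Note that "Previous Day Open Interest" is shortened to oilPrevDayOpenInt rather than simply prefixed.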

     

    The Loop subprocess generates a number of future dates following the last date of the Test example set. That number is equal to the horizon. But for some reason the Loop subprocess doesn't generate a new prediction(label) for every new (future) date. It copies the prediction(label) from the Remember/Recall operators (the last row of the Test example set) and adds this value (as a constant) to every new future date.

    It is my understanding that Thomas' Loop subprocess implementation generates a new prediction(label) using the model and puts its value in the attribute "Close". In my humble opinion, the attribute Close doesn't exist in Thomas' Loop subprocess; it should be Close-0. So I don't know whether the example process that I reused in my process is functioning properly either.
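
    To make the symptom concrete, here is a minimal sketch in plain Python (not RapidMiner, and not a translation of the attached process). It assumes a toy one-step model y[t] = a * y[t-1] + b standing in for the SVM, and calendar-day steps as in the date_add expression of the loop. The function names fit_ar1, recursive_forecast and constant_forecast and the price series are hypothetical. It contrasts a loop that feeds each prediction back in as the next input with a loop that applies the model to the same remembered row every time, which is one possible explanation for the constant value.

```python
from datetime import date, timedelta


def fit_ar1(series):
    """Least-squares fit of y[t] = a * y[t-1] + b on a list of floats."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def recursive_forecast(last_date, last_value, model, horizon):
    """Each step uses the previous prediction as the new input."""
    a, b = model
    rows = []
    current_date, current_value = last_date, last_value
    for _ in range(horizon):
        current_value = a * current_value + b        # predict one step ahead
        current_date = current_date + timedelta(days=1)
        rows.append((current_date, current_value))   # prediction becomes next input
    return rows


def constant_forecast(last_date, last_value, model, horizon):
    """Each step re-applies the model to the same stored last row."""
    a, b = model
    rows = []
    for step in range(1, horizon + 1):
        prediction = a * last_value + b              # identical input every iteration
        rows.append((last_date + timedelta(days=step), prediction))
    return rows


history = [50.0, 51.0, 52.0, 53.0, 54.0]  # made-up values, not real oil prices
model = fit_ar1(history)
start = date(2017, 10, 4)                 # illustrative last known date
for label, fn in (("recursive", recursive_forecast), ("constant", constant_forecast)):
    print(label, [(d.isoformat(), round(v, 2)) for d, v in fn(start, history[-1], model, 3)])
```

    Both loops produce the same future dates, but only the first one produces a different value per date. If the loop in a process recalls an unchanged row on every iteration, the second behaviour is what one would expect to see.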

    Any help to get rid of this last flaw in my process model is appreciated.

     

    Thanks for the support.

    Luc

     

    oilTimeSeriesPrediction.jpeg

     

    oilTimeSeriesPrediction2.jpeg

     

      

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Training From Date" width="90" x="581" y="85">
    <parameter key="macro" value="AnalysesDateFrom"/>
    <parameter key="value" value="2016/02/11"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Training To Date" width="90" x="715" y="85">
    <parameter key="macro" value="TrainingDateTo"/>
    <parameter key="value" value="2017/09/10"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Prediction Horizon" width="90" x="849" y="85">
    <parameter key="macro" value="PredictionHorizon"/>
    <parameter key="value" value="10"/>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Get/Join Data" width="90" x="112" y="85">
    <process expanded="true">
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Oil Futures" width="90" x="313" y="34">
    <process expanded="true">
    <operator activated="false" class="jdbc_connectors:read_database" compatibility="7.6.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
    <parameter key="define_connection" value="predefined"/>
    <parameter key="connection" value="MySQL"/>
    <parameter key="database_system" value="MySQL"/>
    <parameter key="define_query" value="query"/>
    <parameter key="query" value="SELECT *&#10;FROM `oil`&#10;ORDER BY Date desc&#10;limit 9999"/>
    <parameter key="use_default_schema" value="true"/>
    <parameter key="prepare_statement" value="false"/>
    <enumeration key="parameters"/>
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    </operator>
    <operator activated="false" class="store" compatibility="7.6.001" expanded="true" height="68" name="Store (11)" width="90" x="179" y="34">
    <parameter key="repository_entry" value="//Cloud Repository/Samples/data/oilfuturesvw"/>
    </operator>
    <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve (2)" width="90" x="45" y="136">
    <parameter key="repository_entry" value="../data/oilfuturesvw"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="514" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value="Volume|Settle|Previous Day Open Interest|Open|Low|Last|High|Date"/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="true" class="nominal_to_date" compatibility="7.6.001" expanded="true" height="82" name="Nominal to Date (8)" width="90" x="648" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="date_type" value="date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <parameter key="keep_old_attribute" value="false"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename (8)" width="90" x="782" y="34">
    <parameter key="old_name" value="Date"/>
    <parameter key="new_name" value="oilDate"/>
    <list key="rename_additional_attributes">
    <parameter key="High" value="oilHigh"/>
    <parameter key="Low" value="oilLow"/>
    <parameter key="Open" value="oilOpen"/>
    <parameter key="Previous Day Open Interest" value="oilPrevDayOpenInt"/>
    <parameter key="Settle" value="oilSettle"/>
    <parameter key="Volume" value="oilVolume"/>
    <parameter key="Last" value="oilLast"/>
    </list>
    </operator>
    <connect from_op="Read Database (2)" from_port="output" to_op="Store (11)" to_port="input"/>
    <connect from_op="Retrieve (2)" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Nominal to Date (8)" to_port="example set input"/>
    <connect from_op="Nominal to Date (8)" from_port="example set output" to_op="Rename (8)" to_port="example set input"/>
    <connect from_op="Rename (8)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Oil Futures" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="sort" compatibility="7.6.001" expanded="true" height="82" name="Sort" width="90" x="246" y="85">
    <parameter key="attribute_name" value="oilDate"/>
    <parameter key="sorting_direction" value="increasing"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="103" name="Multiply" width="90" x="45" y="340"/>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="238">
    <parameter key="attribute_name" value="oilLast"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    <parameter key="oilLast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="238">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value="oilLast|oilHigh|oilLow|oilOpen|oilSettle|oilPrevDayOpenInt|oilVolume"/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Start of Trend" width="90" x="447" y="238">
    <parameter key="parameter_expression" value="date_after(oilDate, date_parse_custom(%{AnalysesDateFrom}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Train until Hold-off" width="90" x="581" y="238">
    <parameter key="parameter_expression" value="date_before(oilDate, date_parse_custom(%{TrainingDateTo}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="715" y="238">
    <parameter key="series_representation" value="encode_series_by_examples"/>
    <parameter key="window_size" value="1"/>
    <parameter key="step_size" value="1"/>
    <parameter key="create_single_attributes" value="true"/>
    <parameter key="create_label" value="true"/>
    <parameter key="select_label_by_dimension" value="false"/>
    <parameter key="label_attribute" value="oilLast"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="add_incomplete_windows" value="true"/>
    <parameter key="stop_on_too_small_dataset" value="true"/>
    </operator>
    <operator activated="true" class="series:sliding_window_validation" compatibility="7.4.000" expanded="true" height="124" name="Validation" width="90" x="849" y="238">
    <parameter key="create_complete_model" value="false"/>
    <parameter key="training_window_width" value="100"/>
    <parameter key="training_window_step_size" value="1"/>
    <parameter key="test_window_width" value="100"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="cumulative_training" value="true"/>
    <parameter key="average_performances_only" value="true"/>
    <process expanded="true">
    <operator activated="true" class="support_vector_machine" compatibility="7.6.001" expanded="true" height="124" name="SVM" width="90" x="185" y="34">
    <parameter key="kernel_type" value="dot"/>
    <parameter key="kernel_gamma" value="1.0"/>
    <parameter key="kernel_sigma1" value="1.0"/>
    <parameter key="kernel_sigma2" value="0.0"/>
    <parameter key="kernel_sigma3" value="2.0"/>
    <parameter key="kernel_shift" value="1.0"/>
    <parameter key="kernel_degree" value="2.0"/>
    <parameter key="kernel_a" value="1.0"/>
    <parameter key="kernel_b" value="0.0"/>
    <parameter key="kernel_cache" value="200"/>
    <parameter key="C" value="0.0"/>
    <parameter key="convergence_epsilon" value="0.001"/>
    <parameter key="max_iterations" value="100000"/>
    <parameter key="scale" value="true"/>
    <parameter key="calculate_weights" value="true"/>
    <parameter key="return_optimization_performance" value="true"/>
    <parameter key="L_pos" value="1.0"/>
    <parameter key="L_neg" value="1.0"/>
    <parameter key="epsilon" value="0.0"/>
    <parameter key="epsilon_plus" value="0.0"/>
    <parameter key="epsilon_minus" value="0.0"/>
    <parameter key="balance_cost" value="false"/>
    <parameter key="quadratic_loss_pos" value="false"/>
    <parameter key="quadratic_loss_neg" value="false"/>
    <parameter key="estimate_performance" value="false"/>
    </operator>
    <connect from_port="training" to_op="SVM" to_port="training set"/>
    <connect from_op="SVM" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    <operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="main_criterion" value="prediction_trend_accuracy"/>
    <parameter key="prediction_trend_accuracy" value="true"/>
    <parameter key="skip_undefined_labels" value="true"/>
    <parameter key="use_example_weights" value="false"/>
    </operator>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role (2)" width="90" x="179" y="442">
    <parameter key="attribute_name" value="oilLast"/>
    <parameter key="target_role" value="prediction"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    <parameter key="oilLast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (4)" width="90" x="313" y="442">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value="oilHigh|oilLow|oilOpen|oilPrevDayOpenInt|oilSettle|oilVolume|oilLast"/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Hold-off to Test" width="90" x="447" y="442">
    <parameter key="parameter_expression" value="date_after(oilDate, date_parse_custom(%{TrainingDateTo}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <parameter key="invert_filter" value="false"/>
    <list key="filters_list"/>
    <parameter key="filters_logic_and" value="true"/>
    <parameter key="filters_check_metadata" value="true"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing (2)" width="90" x="581" y="442">
    <parameter key="series_representation" value="encode_series_by_examples"/>
    <parameter key="window_size" value="1"/>
    <parameter key="step_size" value="1"/>
    <parameter key="create_single_attributes" value="false"/>
    <parameter key="create_label" value="false"/>
    <parameter key="select_label_by_dimension" value="false"/>
    <parameter key="label_attribute" value="oilLast"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="add_incomplete_windows" value="false"/>
    <parameter key="stop_on_too_small_dataset" value="false"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="103" name="Multiply (2)" width="90" x="715" y="442"/>
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="916" y="442">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="7.6.001" expanded="true" height="68" name="Extract Macro" width="90" x="112" y="646">
    <parameter key="macro" value="n_examples"/>
    <parameter key="macro_type" value="number_of_examples"/>
    <parameter key="statistics" value="average"/>
    <parameter key="attribute_name" value=""/>
    <list key="additional_macros"/>
    <description align="center" color="transparent" colored="false" width="126">Calculate&lt;br&gt;amount of&lt;br&gt;rows of the&lt;br&gt;Windowed Test example set</description>
    </operator>
    <operator activated="true" class="generate_macro" compatibility="7.6.001" expanded="true" height="82" name="Generate Macro" width="90" x="246" y="646">
    <list key="function_descriptions">
    <parameter key="filter_range" value="eval(%{n_examples})-1"/>
    </list>
    <description align="center" color="transparent" colored="false" width="126">Set macro filter_range&lt;br&gt;to amount of rows in Test example set minus 1&lt;br&gt;(to obtain last row of the Test example set)</description>
    </operator>
    <operator activated="true" class="filter_example_range" compatibility="7.6.001" expanded="true" height="82" name="Filter Example Range" width="90" x="380" y="646">
    <parameter key="first_example" value="1"/>
    <parameter key="last_example" value="%{filter_range}"/>
    <parameter key="invert_filter" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Obtain the last row&lt;br&gt;in the Test example set</description>
    </operator>
    <operator activated="true" class="remember" compatibility="7.6.001" expanded="true" height="68" name="Remember" width="90" x="514" y="646">
    <parameter key="name" value="LastRow"/>
    <parameter key="io_object" value="ExampleSet"/>
    <parameter key="store_which" value="1"/>
    <parameter key="remove_from_process" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Remember the&lt;br&gt;last row of Test example set incl. the last date to start&lt;br&gt;the loop for&lt;br&gt;predictions on future dates</description>
    </operator>
    <operator activated="true" class="loop" compatibility="7.6.001" expanded="true" height="82" name="Loop" width="90" x="782" y="646">
    <parameter key="set_iteration_macro" value="true"/>
    <parameter key="macro_name" value="loop_forecasts"/>
    <parameter key="macro_start_value" value="1"/>
    <parameter key="iterations" value="%{PredictionHorizon}"/>
    <parameter key="limit_time" value="false"/>
    <parameter key="timeout" value="1"/>
    <process expanded="true">
    <operator activated="true" class="recall" compatibility="7.6.001" expanded="true" height="68" name="Recall" width="90" x="45" y="136">
    <parameter key="name" value="LastRow"/>
    <parameter key="io_object" value="ExampleSet"/>
    <parameter key="remove_from_store" value="false"/>
    <description align="center" color="transparent" colored="false" width="126">Recall the last row&lt;br&gt;of the Test example set to define structure of the example set that will be generated by the loop operator.&lt;br/&gt;It defines also the last Test date in order to generate new dates.</description>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model (3)" width="90" x="246" y="34">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    <operator activated="true" class="generate_attributes" compatibility="7.6.001" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="34">
    <list key="function_descriptions">
    <parameter key="oilDate" value="date_add(oilDate,eval(%{loop_forecasts}),DATE_UNIT_DAY)"/>
    </list>
    <parameter key="keep_all" value="true"/>
    <description align="center" color="transparent" colored="false" width="126">Generate n future dates (one by one each loop) adjacent to&lt;br&gt;the last date of the Test example set. n = %{ PredictionHorizon}</description>
    </operator>
    <operator activated="true" class="set_role" compatibility="5.3.013" expanded="true" height="82" name="Set Role (3)" width="90" x="514" y="34">
    <parameter key="attribute_name" value="prediction(label)"/>
    <parameter key="target_role" value="regular"/>
    <list key="set_additional_roles">
    <parameter key="oilOpen-0" value="regular"/>
    <parameter key="oilHigh-0" value="regular"/>
    <parameter key="oilLow-0" value="regular"/>
    <parameter key="oilSettle-0" value="regular"/>
    <parameter key="oilVolume-0" value="regular"/>
    <parameter key="oilPrevDayOpenInt-0" value="regular"/>
    <parameter key="oilLast-0" value="regular"/>
    <parameter key="oilDate" value="id"/>
    <parameter key="oilLast-0" value="regular"/>
    </list>
    <description align="center" color="transparent" colored="false" width="126">Set the role of the prediction(label) to&lt;br/&gt;regular</description>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (3)" width="90" x="648" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="prediction(label)"/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <description align="center" color="transparent" colored="false" width="126">Select the prediction(label)</description>
    </operator>
    <operator activated="true" class="replace" compatibility="7.6.001" expanded="true" height="82" name="Replace" width="90" x="782" y="34">
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value="oilLast-0"/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="file_path"/>
    <parameter key="block_type" value="single_value"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="single_value"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="replace_what" value="oilLast-0"/>
    <parameter key="replace_by" value="$1-"/>
    <description align="center" color="transparent" colored="false" width="126">Replace oilLast-0 value,&lt;br&gt;using backreference to the previous operator &amp;quot;$1-&amp;quot;, by the prediction(label) value</description>
    </operator>
    <operator activated="true" class="materialize_data" compatibility="7.6.001" expanded="true" height="82" name="Materialize Data (2)" width="90" x="916" y="34">
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    <description align="center" color="transparent" colored="false" width="126">Clean-up&lt;br/&gt;memory to get a clean example set</description>
    </operator>
    <connect from_port="input 1" to_op="Apply Model (3)" to_port="model"/>
    <connect from_op="Recall" from_port="result" to_op="Apply Model (3)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (3)" from_port="labelled data" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role (3)" to_port="example set input"/>
    <connect from_op="Set Role (3)" from_port="example set output" to_op="Select Attributes (3)" to_port="example set input"/>
    <connect from_op="Select Attributes (3)" from_port="example set output" to_op="Replace" to_port="example set input"/>
    <connect from_op="Replace" from_port="example set output" to_op="Materialize Data (2)" to_port="example set input"/>
    <connect from_op="Materialize Data (2)" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    <description align="center" color="transparent" colored="false" width="126">Generate in each loop a new future date and apply model on that date</description>
    </operator>
    <operator activated="true" class="append" compatibility="7.6.001" expanded="true" height="82" name="Append" width="90" x="916" y="646">
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    <parameter key="merge_type" value="all"/>
    <description align="center" color="transparent" colored="false" width="126">Append each result from loop to the future prediction example set</description>
    </operator>
    <connect from_op="Get/Join Data" from_port="out 1" to_op="Sort" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Filter Start of Trend" to_port="example set input"/>
    <connect from_op="Filter Start of Trend" from_port="example set output" to_op="Train until Hold-off" to_port="example set input"/>
    <connect from_op="Train until Hold-off" from_port="example set output" to_op="Windowing" to_port="example set input"/>
    <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
    <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
    <connect from_op="Set Role (2)" from_port="example set output" to_op="Select Attributes (4)" to_port="example set input"/>
    <connect from_op="Select Attributes (4)" from_port="example set output" to_op="Filter Hold-off to Test" to_port="example set input"/>
    <connect from_op="Filter Hold-off to Test" from_port="example set output" to_op="Windowing (2)" to_port="example set input"/>
    <connect from_op="Windowing (2)" from_port="example set output" to_op="Multiply (2)" to_port="input"/>
    <connect from_op="Multiply (2)" from_port="output 1" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Multiply (2)" from_port="output 2" to_op="Extract Macro" to_port="example set"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 2"/>
    <connect from_op="Apply Model (2)" from_port="model" to_op="Loop" to_port="input 1"/>
    <connect from_op="Extract Macro" from_port="example set" to_op="Generate Macro" to_port="through 1"/>
    <connect from_op="Generate Macro" from_port="through 1" to_op="Filter Example Range" to_port="example set input"/>
    <connect from_op="Filter Example Range" from_port="example set output" to_op="Remember" to_port="store"/>
    <connect from_op="Loop" from_port="output 1" to_op="Append" to_port="example set 1"/>
    <connect from_op="Append" from_port="merged set" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    <description align="center" color="yellow" colored="true" height="134" resized="true" width="554" x="476" y="47">Process Configuration (training example set, horizon, window, holdoff example set)</description>
    <description align="center" color="green" colored="true" height="185" resized="true" width="947" x="83" y="199">Train / Validate the Time Series Model</description>
    <description align="center" color="blue" colored="true" height="169" resized="true" width="958" x="73" y="408">Test the Time Series Model</description>
    <description align="center" color="gray" colored="true" height="132" resized="true" width="299" x="84" y="49">Get source data</description>
    <description align="center" color="orange" colored="true" height="277" resized="true" width="961" x="74" y="615">Generate Future Predictions</description>
    </process>
    </operator>
    </process>
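
The operator comments in the "Generate Future Predictions" block describe three steps per loop iteration: apply the model to the last row, generate the next date, and replace oilLast-0 with the prediction. Outside RapidMiner, that idea is usually called an iterated (recursive) forecast. The Python sketch below illustrates the general technique only; it is not a translation of the XML above. The names `iterated_forecast`, `predict_next` and `toy_model`, and all numbers, are made up for illustration.

```python
from datetime import date, timedelta


def iterated_forecast(predict_next, last_date, last_value, horizon):
    """Roll a one-step-ahead model forward `horizon` calendar days.

    `predict_next` is a hypothetical stand-in for a trained model that
    maps the most recent value to the predicted next value.
    """
    rows = []
    current_date, current_value = last_date, last_value
    for _ in range(horizon):
        current_value = predict_next(current_value)      # apply the model
        current_date = current_date + timedelta(days=1)  # generate the next date
        rows.append((current_date, current_value))       # collect the result row
    return rows


def toy_model(value):
    """Toy stand-in model (assumption): next = 0.5 * previous + 1.0."""
    return 0.5 * value + 1.0


forecast = iterated_forecast(toy_model, date(2017, 9, 28), 4.0, horizon=3)
for day, value in forecast:
    print(day.isoformat(), value)
```

Like `DATE_UNIT_DAY` in the Generate Attributes operator, the sketch steps through calendar days, so weekends and non-trading days are not skipped.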

     

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I went back a while after that original process was posted and fixed it because it wasn't generating the closing values per day correctly. I have to look for it on my other machine. 

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi @luc_bartkowski - OK I spent some time looking at your process. Maybe I'm missing something but where you are "testing" the model you are actually forecasting forward.  The output of that Apply Model operator is showing you 10-day-forward predictions of oilLast.  Right?  

     

    Screen Shot 2017-10-04 at 10.38.30 AM.png

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Training From Date" width="90" x="581" y="85">
    <parameter key="macro" value="AnalysesDateFrom"/>
    <parameter key="value" value="2016/02/11"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Training To Date" width="90" x="715" y="85">
    <parameter key="macro" value="TrainingDateTo"/>
    <parameter key="value" value="2017/09/10"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="68" name="Prediction Horizon" width="90" x="849" y="85">
    <parameter key="macro" value="PredictionHorizon"/>
    <parameter key="value" value="10"/>
    </operator>
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Get/Join Data" width="90" x="112" y="85">
    <process expanded="true">
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Oil Futures" width="90" x="313" y="34">
    <process expanded="true">
    <operator activated="false" class="jdbc_connectors:read_database" compatibility="7.6.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
    <parameter key="connection" value="MySQL"/>
    <parameter key="query" value="SELECT *&#10;FROM `oil`&#10;ORDER BY Date desc&#10;limit 9999"/>
    <enumeration key="parameters"/>
    </operator>
    <operator activated="false" class="store" compatibility="7.6.001" expanded="true" height="68" name="Store (11)" width="90" x="179" y="34">
    <parameter key="repository_entry" value="//Cloud Repository/Samples/data/oilfuturesvw"/>
    </operator>
    <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve CHRIS-CME_CL1" width="90" x="45" y="136">
    <parameter key="repository_entry" value="//RapidMiner OneDrive/random community stuff/CHRIS-CME_CL1"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="514" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Volume|Settle|Previous Day Open Interest|Open|Low|Last|High|Date"/>
    </operator>
    <operator activated="false" class="nominal_to_date" compatibility="7.6.001" expanded="true" height="82" name="Nominal to Date (8)" width="90" x="648" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    <parameter key="time_zone" value="SYSTEM"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename (8)" width="90" x="782" y="34">
    <parameter key="old_name" value="Date"/>
    <parameter key="new_name" value="oilDate"/>
    <list key="rename_additional_attributes">
    <parameter key="High" value="oilHigh"/>
    <parameter key="Low" value="oilLow"/>
    <parameter key="Open" value="oilOpen"/>
    <parameter key="Previous Day Open Interest" value="oilPrevDayOpenInt"/>
    <parameter key="Settle" value="oilSettle"/>
    <parameter key="Volume" value="oilVolume"/>
    <parameter key="Last" value="oilLast"/>
    </list>
    </operator>
    <connect from_op="Read Database (2)" from_port="output" to_op="Store (11)" to_port="input"/>
    <connect from_op="Retrieve CHRIS-CME_CL1" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Rename (8)" to_port="example set input"/>
    <connect from_op="Rename (8)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Oil Futures" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="sort" compatibility="7.6.001" expanded="true" height="82" name="Sort" width="90" x="246" y="85">
    <parameter key="attribute_name" value="oilDate"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.6.001" expanded="true" height="103" name="Multiply" width="90" x="45" y="340"/>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="238">
    <parameter key="attribute_name" value="oilLast"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    <parameter key="oilLast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="238">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="oilHigh|oilLast|oilLow|oilOpen|oilPrevDayOpenInt|oilSettle|oilVolume"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Start of Trend" width="90" x="447" y="238">
    <parameter key="parameter_expression" value="date_after(oilDate, date_parse_custom(%{AnalysesDateFrom}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Train until Hold-off" width="90" x="581" y="238">
    <parameter key="parameter_expression" value="date_before(oilDate, date_parse_custom(%{TrainingDateTo}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="715" y="238">
    <parameter key="window_size" value="1"/>
    <parameter key="create_label" value="true"/>
    <parameter key="label_attribute" value="oilLast"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="add_incomplete_windows" value="true"/>
    </operator>
    <operator activated="true" class="series:sliding_window_validation" compatibility="7.4.000" expanded="true" height="124" name="Validation" width="90" x="849" y="238">
    <parameter key="training_window_step_size" value="1"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="cumulative_training" value="true"/>
    <process expanded="true">
    <operator activated="true" class="support_vector_machine" compatibility="7.6.001" expanded="true" height="124" name="SVM" width="90" x="185" y="34"/>
    <connect from_port="training" to_op="SVM" to_port="training set"/>
    <connect from_op="SVM" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="main_criterion" value="prediction_trend_accuracy"/>
    <parameter key="use_example_weights" value="false"/>
    </operator>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role (2)" width="90" x="179" y="442">
    <parameter key="attribute_name" value="oilLast"/>
    <parameter key="target_role" value="prediction"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    <parameter key="oilLast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (4)" width="90" x="313" y="442">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="oilHigh|oilLow|oilOpen|oilPrevDayOpenInt|oilSettle|oilVolume|oilLast"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing (2)" width="90" x="447" y="442">
    <parameter key="window_size" value="1"/>
    <parameter key="create_single_attributes" value="false"/>
    <parameter key="label_attribute" value="oilLast"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="stop_on_too_small_dataset" value="false"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="916" y="442">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="sort" compatibility="7.6.001" expanded="true" height="82" name="Sort (2)" width="90" x="1050" y="442">
    <parameter key="attribute_name" value="oilDate"/>
    <parameter key="sorting_direction" value="decreasing"/>
    </operator>
    <connect from_op="Get/Join Data" from_port="out 1" to_op="Sort" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Filter Start of Trend" to_port="example set input"/>
    <connect from_op="Filter Start of Trend" from_port="example set output" to_op="Train until Hold-off" to_port="example set input"/>
    <connect from_op="Train until Hold-off" from_port="example set output" to_op="Windowing" to_port="example set input"/>
    <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
    <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
    <connect from_op="Set Role (2)" from_port="example set output" to_op="Select Attributes (4)" to_port="example set input"/>
    <connect from_op="Select Attributes (4)" from_port="example set output" to_op="Windowing (2)" to_port="example set input"/>
    <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Sort (2)" to_port="example set input"/>
    <connect from_op="Sort (2)" from_port="example set output" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <description align="center" color="yellow" colored="true" height="134" resized="true" width="554" x="476" y="47">Process Configuration (training example set, horizon, window, holdoff example set)</description>
    <description align="center" color="green" colored="true" height="185" resized="true" width="947" x="83" y="199">Train / Validate the Time Series Model</description>
    <description align="center" color="gray" colored="true" height="132" resized="true" width="299" x="84" y="49">Get source data</description>
    </process>
    </operator>
    </process>

     

    Scott

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    My model has a process parameter (top right) which sets the horizon.

    So I can play around with different horizon options.

     

    This horizon is used in the training/validation process, and also in the test process.

    I want to use the same horizon for future predictions.

    So yes, if the horizon is set to 10 then I want to forecast the Last value of Oct 8, taking into account that the last date in the training/validate/test example set is Sep 28.
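
As a quick check of that date arithmetic, here is a minimal Python sketch. It assumes the year is 2017 and that the horizon counts calendar days; `target_date` is a hypothetical helper name.

```python
from datetime import date, timedelta


def target_date(last_known, horizon_days):
    """Return the date that lies `horizon_days` calendar days after `last_known`."""
    return last_known + timedelta(days=horizon_days)


last_known = date(2017, 9, 28)      # last date in the example set (year assumed)
print(target_date(last_known, 10))  # prints 2017-10-08
```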

     

    I suspect that Thomas's example model only works for horizon = 1, so I have altered my model.

    My altered model selects the last n rows from the test example set and passes them to a "Loop Examples" subprocess.

    The "Loop Examples" subprocess therefore gets the values it needs to calculate the prediction(label) for future oilDates.

    In the "Loop Examples" subprocess I have also managed to shift the date, i.e. oilDate, N days ahead, where N is again the horizon.

    But then I'm stuck: I don't know what to do, or which operators to use, to get the desired predictions for future dates.

    Please find the altered model in the following XML.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <operator activated="false" class="replace" compatibility="7.6.001" expanded="true" height="82" name="Replace (3)" width="90" x="514" y="289">
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value="oilLast-0"/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="file_path"/>
    <parameter key="block_type" value="single_value"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="single_value"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="replace_what" value="oilLast-0"/>
    <parameter key="replace_by" value="$1-"/>
    <description align="center" color="transparent" colored="false" width="126">Replace oilLast-0 value,&lt;br&gt;using backreference to the previous operator &amp;quot;$1-&amp;quot;, by the prediction(label) value</description>
    </operator>
    </process>

     

     

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    @Thomas_Ott

     

    Dear Thomas, I agree.

    I realized that myself as well. You solved another problem, independent of the process.

    But still, it was the only template for a solution. I was happy to find any template for a solution regarding the topic.

    Despite all the information, tutorials, templates, blogs and videos on the web about Time Series Forecasting with RapidMiner, it was your posts that pointed me to a possible solution, and I thank you for that. Again, you're doing a great job and I have learned a lot from you. Thank you.

    Best regards,

    Luc

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @luc_bartkowski Thank you for your kind words. I have a bunch of time series processes that I should just organize and repost. They are super important because Time Series in RapidMiner is not very organized (as of yet), but the development team and Community have made progress.

     

     

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    Hi Scott,

     

    I know why you are able to predict until Oct 3rd in your picture. That is because you downloaded the Quandl source data yesterday on Oct 4th.

    Your source data includes values for oilOpen, etc. on Oct 3rd. That is the reason Oct 3rd, including a valuable prediction, is visible in your picture. But your picture doesn't show predictions beyond Oct 3rd, whatever your horizon is.

    I'm sorry but I therefore cannot hit the "Solved" button on this topic.

     

    I'm beginning to suspect that the problem, addressed in this topic, is:

    The "Apply model" operator (also) always needs a Label to calculate predictions.

    Because such Label is not available for Future dates, the "Apply model" operator will never be able to calculate predictions for future dates.

     

    To illustrate this conclusion, take another look at my first picture in this post. To generate this picture I added future dates with fake values ("0", i.e. zero) for all attributes beyond Sep 28, including zero values for the Labels (oilLast) on Sep 29 to Oct 5. The "Apply model" operator uses these future (fake) Labels to predict on these future dates. Therefore all predictions beyond Sep 28 have a value of 7.667, based on a Label with a value of "0", i.e. zero, for these future dates. As stated before: I suspect that "Apply model" always needs a valid Label in order to predict.

     

    Either that is the explanation of the problem addressed in this topic, or my implementation of "Apply model" is not correct.

    If the latter is the case, please send as a reply an example model in XML that implements an "Apply model" operator that will predict beyond the scope of a source data set. 
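
For comparison, here is a minimal Python sketch of how this works in a generic regression setting. It says nothing about the internals of the "Apply model" operator. In general, a trained regression model computes its prediction from the feature values alone; the label is only needed for training and for measuring performance. A deterministic model that is fed identical feature values (such as all zeros) will therefore return the same prediction for every row. The helper names `make_windowed` and `fit_line` and the toy series are assumptions made for illustration.

```python
def make_windowed(values, horizon):
    """Window size 1: the feature is the value at t, the label is the value
    at t + horizon, or None when that value is not known yet."""
    rows = []
    for t, x in enumerate(values):
        label = values[t + horizon] if t + horizon < len(values) else None
        rows.append((x, label))
    return rows


def fit_line(pairs):
    """Ordinary least squares for y = a * x + b on labelled (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy data, not the oil series
horizon = 2
rows = make_windowed(series, horizon)

labelled = [(x, y) for x, y in rows if y is not None]  # used for training
unlabelled = [x for x, y in rows if y is None]         # last `horizon` rows

a, b = fit_line(labelled)
future = [a * x + b for x in unlabelled]  # predictions without any label
print(future)
```

The last `horizon` rows have real feature values but no label, and those are the rows whose predictions fall beyond the end of the source data.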

     

    Best regards,

    Luc

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    OK I spent some time on this.  Let me know what you think.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Get/Join Data" width="90" x="112" y="85">
    <process expanded="true">
    <operator activated="true" class="subprocess" compatibility="7.6.001" expanded="true" height="82" name="Oil Futures" width="90" x="313" y="34">
    <process expanded="true">
    <operator activated="false" class="jdbc_connectors:read_database" compatibility="7.6.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
    <parameter key="connection" value="MySQL"/>
    <parameter key="query" value="SELECT *&#10;FROM `oil`&#10;ORDER BY Date desc&#10;limit 9999"/>
    <enumeration key="parameters"/>
    </operator>
    <operator activated="false" class="store" compatibility="7.6.001" expanded="true" height="68" name="Store (11)" width="90" x="179" y="34">
    <parameter key="repository_entry" value="//Cloud Repository/Samples/data/oilfuturesvw"/>
    </operator>
    <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve CHRIS-CME_CL1" width="90" x="45" y="136">
    <parameter key="repository_entry" value="//RapidMiner OneDrive/random community stuff/CHRIS-CME_CL1"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="514" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Volume|Settle|Previous Day Open Interest|Open|Low|Last|High|Date"/>
    </operator>
    <operator activated="false" class="nominal_to_date" compatibility="7.6.001" expanded="true" height="82" name="Nominal to Date (8)" width="90" x="648" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename (8)" width="90" x="782" y="34">
    <parameter key="old_name" value="Date"/>
    <parameter key="new_name" value="oilDate"/>
    <list key="rename_additional_attributes">
    <parameter key="High" value="oilHigh"/>
    <parameter key="Low" value="oilLow"/>
    <parameter key="Open" value="oilOpen"/>
    <parameter key="Previous Day Open Interest" value="oilPrevDayOpenInt"/>
    <parameter key="Settle" value="oilSettle"/>
    <parameter key="Volume" value="oilVolume"/>
    <parameter key="Last" value="oilLast"/>
    </list>
    </operator>
    <connect from_op="Read Database (2)" from_port="output" to_op="Store (11)" to_port="input"/>
    <connect from_op="Retrieve CHRIS-CME_CL1" from_port="output" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Rename (8)" to_port="example set input"/>
    <connect from_op="Rename (8)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Oil Futures" from_port="out 1" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="sort" compatibility="7.6.001" expanded="true" height="82" name="Sort" width="90" x="246" y="85">
    <parameter key="attribute_name" value="oilDate"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="82" name="Analyze From Date" width="90" x="447" y="85">
    <parameter key="macro" value="AnalysesDateFrom"/>
    <parameter key="value" value="2014/02/11"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="82" name="Training To Date" width="90" x="581" y="85">
    <parameter key="macro" value="TrainingDateTo"/>
    <parameter key="value" value="2017/09/10"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.6.001" expanded="true" height="82" name="Prediction Horizon" width="90" x="715" y="85">
    <parameter key="macro" value="PredictionHorizon"/>
    <parameter key="value" value="10"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="7.6.001" expanded="true" height="103" name="Filter Analysis Data" width="90" x="849" y="85">
    <parameter key="parameter_expression" value="date_after(oilDate, date_parse_custom(%{AnalysesDateFrom}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="split_data" compatibility="7.6.001" expanded="true" height="103" name="Split Data" width="90" x="983" y="85">
    <enumeration key="partitions">
    <parameter key="ratio" value="0.8"/>
    <parameter key="ratio" value="0.2"/>
    </enumeration>
    <parameter key="sampling_type" value="linear sampling"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role" width="90" x="179" y="238">
    <parameter key="attribute_name" value="oilLast"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="238">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="oilHigh|oilLast|oilLow|oilOpen|oilPrevDayOpenInt|oilSettle|oilVolume"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="447" y="238">
    <parameter key="window_size" value="1"/>
    <parameter key="create_label" value="true"/>
    <parameter key="label_attribute" value="oilLast"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="add_incomplete_windows" value="true"/>
    </operator>
    <operator activated="true" class="rename_by_replacing" compatibility="7.6.001" expanded="true" height="82" name="Rename by Replacing" width="90" x="581" y="238">
    <parameter key="replace_what" value="[-]0"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename" width="90" x="715" y="238">
    <parameter key="old_name" value="label"/>
    <parameter key="new_name" value="%{PredictionHorizon} days forward"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <operator activated="true" class="series:sliding_window_validation" compatibility="7.4.000" expanded="true" height="124" name="Validation" width="90" x="849" y="238">
    <parameter key="training_window_step_size" value="1"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="cumulative_training" value="true"/>
    <process expanded="true">
    <operator activated="true" class="support_vector_machine" compatibility="7.6.001" expanded="true" height="124" name="SVM" width="90" x="185" y="34"/>
    <connect from_port="training" to_op="SVM" to_port="training set"/>
    <connect from_op="SVM" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="main_criterion" value="prediction_trend_accuracy"/>
    <parameter key="use_example_weights" value="false"/>
    </operator>
    <connect from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.001" expanded="true" height="82" name="Set Role (3)" width="90" x="179" y="442">
    <parameter key="attribute_name" value="oilLast"/>
    <list key="set_additional_roles">
    <parameter key="oilDate" value="id"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (3)" width="90" x="313" y="442">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="oilHigh|oilLast|oilLow|oilOpen|oilPrevDayOpenInt|oilSettle|oilVolume"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing (2)" width="90" x="447" y="442">
    <parameter key="window_size" value="1"/>
    <parameter key="create_label" value="true"/>
    <parameter key="label_attribute" value="oilLast"/>
    <parameter key="horizon" value="%{PredictionHorizon}"/>
    <parameter key="add_incomplete_windows" value="true"/>
    </operator>
    <operator activated="true" class="rename_by_replacing" compatibility="7.6.001" expanded="true" height="82" name="Rename by Replacing (2)" width="90" x="581" y="442">
    <parameter key="replace_what" value="[-]0"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.6.001" expanded="true" height="82" name="Rename (3)" width="90" x="715" y="442">
    <parameter key="old_name" value="label"/>
    <parameter key="new_name" value="%{PredictionHorizon} days forward"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model (3)" width="90" x="983" y="442">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.6.001" expanded="true" height="82" name="Apply Model (2)" width="90" x="1050" y="697">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="performance" compatibility="7.6.001" expanded="true" height="82" name="Performance (3)" width="90" x="1117" y="442">
    <parameter key="use_example_weights" value="false"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes (4)" width="90" x="1251" y="595">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="%{PredictionHorizon} days forward|oilDate"/>
    <parameter key="include_special_attributes" value="true"/>
    </operator>
    <operator activated="true" class="join" compatibility="7.6.001" expanded="true" height="82" name="Join" width="90" x="1385" y="697">
    <parameter key="join_type" value="right"/>
    <list key="key_attributes"/>
    </operator>
    <operator activated="true" class="sort" compatibility="7.6.001" expanded="true" height="82" name="Sort (2)" width="90" x="1519" y="697">
    <parameter key="attribute_name" value="oilDate"/>
    <parameter key="sorting_direction" value="decreasing"/>
    </operator>
    <connect from_op="Get/Join Data" from_port="out 1" to_op="Sort" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Analyze From Date" to_port="through 1"/>
    <connect from_op="Analyze From Date" from_port="through 1" to_op="Training To Date" to_port="through 1"/>
    <connect from_op="Training To Date" from_port="through 1" to_op="Prediction Horizon" to_port="through 1"/>
    <connect from_op="Prediction Horizon" from_port="through 1" to_op="Filter Analysis Data" to_port="example set input"/>
    <connect from_op="Filter Analysis Data" from_port="example set output" to_op="Split Data" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 1" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Split Data" from_port="partition 2" to_op="Set Role (3)" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Windowing" to_port="example set input"/>
    <connect from_op="Windowing" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
    <connect from_op="Rename by Replacing" from_port="example set output" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_op="Validation" to_port="training"/>
    <connect from_op="Validation" from_port="model" to_op="Apply Model (3)" to_port="model"/>
    <connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
    <connect from_op="Set Role (3)" from_port="example set output" to_op="Select Attributes (3)" to_port="example set input"/>
    <connect from_op="Select Attributes (3)" from_port="example set output" to_op="Windowing (2)" to_port="example set input"/>
    <connect from_op="Windowing (2)" from_port="example set output" to_op="Rename by Replacing (2)" to_port="example set input"/>
    <connect from_op="Windowing (2)" from_port="original" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Rename by Replacing (2)" from_port="example set output" to_op="Rename (3)" to_port="example set input"/>
    <connect from_op="Rename (3)" from_port="example set output" to_op="Apply Model (3)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (3)" from_port="labelled data" to_op="Performance (3)" to_port="labelled data"/>
    <connect from_op="Apply Model (3)" from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Join" to_port="right"/>
    <connect from_op="Performance (3)" from_port="performance" to_port="result 2"/>
    <connect from_op="Performance (3)" from_port="example set" to_op="Select Attributes (4)" to_port="example set input"/>
    <connect from_op="Select Attributes (4)" from_port="example set output" to_op="Join" to_port="left"/>
    <connect from_op="Join" from_port="join" to_op="Sort (2)" to_port="example set input"/>
    <connect from_op="Sort (2)" from_port="example set output" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    <description align="center" color="yellow" colored="true" height="161" resized="true" width="706" x="392" y="47">Process Configuration (training example set, horizon, window, holdoff example set)</description>
    <description align="center" color="green" colored="true" height="205" resized="true" width="1074" x="83" y="199">Train / Validate the Time Series Model</description>
    <description align="center" color="gray" colored="true" height="132" resized="true" width="299" x="84" y="49">Get source data</description>
    <description align="center" color="red" colored="true" height="177" resized="true" width="1228" x="89" y="400">Test Model</description>
    <description align="center" color="yellow" colored="false" height="229" resized="true" width="814" x="1002" y="582">Forecasting</description>
    </process>
    </operator>
    </process>

    Scott

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    I have done some additional testing.

     

    The model that Thomas repaired regarding Remember/Recall was based on an implementation of shifting dates.

    I noticed that RapidMiner uses global implementations of Java variables. Macros aren't storage locations; they're pointers to global variables.

    Using that knowledge, I changed the dates elsewhere, in front of the Loop operator. These values won't change inside the Loop because they're global.

    I included the results in the next pictures. One can change the dates and shift them forwards, backwards, anywhere. The pictures are based on a horizon of 10 days. The "Apply Model" operator just uses these dates as an ID; I noticed that in my previous topic. But the "Apply Model" operator doesn't predict beyond the scope of the source data set, whatever the dates of those examples are or become within the process, because there is no future label within the scope of the source data set.

     

    preddatemin.jpeg: Shifting dates 10 days backwards, Sep 28 becomes Sep 18

     

    preddateplus.jpeg: Sept 28 becomes Oct 8. Example of a glo

     

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    Well, the only thing I did in this XML process was to change the source data to my MySQL-based example sets.

    As you know, that source example set has values until Sep 28.

    These are the results. See the following pictures.

     

    The only result example set in this process is provided by the operator Sort (2).

    The scope of that prediction does not go beyond the source data set, in my source data set Sep 28.

     

    predictionssort(2).jpeg: Sort (2) result example set.

     

    predictionscotttest.jpeg

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hello. So I gather from your posts that you did run the process I built. The predictions for 10 days forward are there in that screenshot; they are just not in new rows. The column labeled "prediction(10 days forward)" holds the predicted price of oilLast 10 days AFTER the date listed in oilDate. For example, on September 15, prediction(10 days forward)=50.077, so this is the prediction for oilLast 10 days after September 15. By my calculations that is not Sept 25, because these prices are only listed 5 out of 7 days per week. Hence the model is saying that oilLast will be 50.077 on Sept 29, and so on:

     

    Oct 12: 52.473

    Oct 11: 52.258

    Oct 10: 52.336

    Oct 9: 51.892

    Oct 6: 50.454

    Oct 5: 50.648

    Oct 4: 49.682

    Oct 3: 49.479

    Oct 2: 49.606

    Sept 29: 50.077

     

    That's why you see no values in the "10 days forward" column there: those dates have not happened yet in your data set. Yes, I could have spent some time moving all that around so that it actually looks like what I typed above...  :)
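The date mapping described above can be sketched in a few lines of Python. This is only an illustration, not part of the RapidMiner process: the helper `add_trading_days` is a hypothetical name, and it assumes that every weekday is a trading day (exchange holidays are ignored). It shifts each source date forward by the horizon to find the date a prediction actually belongs to.

```python
from datetime import date, timedelta


def add_trading_days(start: date, n: int) -> date:
    """Return the date n weekdays (Mon-Fri) after start; holidays are ignored."""
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


HORIZON = 10  # same value as the PredictionHorizon macro in the process

# Last ten weekdays in the source data (Sep 15 to Sep 28, 2017).
source_dates = [date(2017, 9, d) for d in (15, 18, 19, 20, 21, 22, 25, 26, 27, 28)]

# A prediction on a source row belongs to the date HORIZON trading days later.
target_dates = {d: add_trading_days(d, HORIZON) for d in source_dates}

for src, tgt in target_dates.items():
    print(f"row dated {src:%b %d} -> prediction applies to {tgt:%b %d}")
```

Under that weekday-only assumption, the row dated Sept 15 maps to Sept 29 and the row dated Sept 28 maps to Oct 12, which matches the list above.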

     

    Scott

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    Solved with ARIMA Trainer & Apply Forecast.

     

    Thank you for your support.

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Dear @luc_bartkowski,

     

    if you have any feedback on the ARIMA operators, please post it here with @tftemme in "CC". We are happy for any feedback on this extension which is work in progress.

     

    Cheers,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven
  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Very nice @luc_bartkowski!

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    The nice thing about prediction operators like SVM and neural nets is that they are multivariate.

    In stock trading terms: the amplitudes of the moving average and the trading volumes are probably correlated.

    ARIMA is univariate, but it is the only operator able to predict a real future.

     

    What I am going to do to enable multivariate future predictions is this:

    I will feed the multivariate prediction operator with real multivariate data and, next to each variable, its univariate prediction, i.e. the output of an ARIMA model. I will train that model with real data. Yes, that means I have to wait until the future is past and I have obtained the labels to train on. Yes, I know, the resulting prediction will have a lag: the label data cannot be newer than now(). None of us has real multivariate data from the future. But one can optimize a prediction.

     

    What happens if the p, d, q parameters used in ARIMA change? I guess the multivariate prediction operator will then get improved data to train its model up to now(), with training data for the future minus the horizon. It is, and always will be, the future we want to predict, so we have to make a guess. We therefore ask ARIMA for a prediction; it is ARIMA's best guess. The multivariate prediction operator will train on it with a target label that runs until now(), i.e. the prediction date minus the horizon.
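The plan above can be sketched as a small Python example. It is a rough illustration under loud assumptions: `drift_forecast` is a simple stand-in for an ARIMA forecast (it is not ARIMA), `build_rows` is a hypothetical helper, and the numbers are toy values rather than real oil data. Each row combines the real values at time t with a univariate forecast of every variable for t + horizon. The label is the real value at t + horizon, and it is missing for the rows whose future has not happened yet.

```python
from typing import Dict, List, Optional


def drift_forecast(history: List[float], horizon: int) -> float:
    """Stand-in for an ARIMA forecast: extend the average step of the history."""
    if len(history) < 2:
        return history[-1]
    step = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + horizon * step


def build_rows(
    series: Dict[str, List[float]],
    label_name: str,
    horizon: int,
    min_history: int = 3,
) -> List[Dict[str, Optional[float]]]:
    """Combine real values at t with univariate forecasts for t + horizon."""
    n = len(series[label_name])
    rows: List[Dict[str, Optional[float]]] = []
    for t in range(min_history - 1, n):
        row: Dict[str, Optional[float]] = {}
        for name, values in series.items():
            row[name] = values[t]  # real value known at time t
            # Forecast made at time t, using only data up to and including t.
            row[f"{name}_forecast"] = drift_forecast(values[: t + 1], horizon)
        future = t + horizon
        # The label exists only once the future has become the past.
        row["label"] = series[label_name][future] if future < n else None
        rows.append(row)
    return rows


# Toy values, for illustration only.
series = {
    "oilLast": [50.0, 51.0, 52.0, 53.0, 54.0, 55.0],
    "oilVolume": [100.0, 110.0, 120.0, 130.0, 140.0, 150.0],
}
rows = build_rows(series, "oilLast", horizon=2)
train = [r for r in rows if r["label"] is not None]  # labels already known
score = [r for r in rows if r["label"] is None]      # the real future
print(len(train), "training rows,", len(score), "rows to score")
```

The rows in `train` would go to the multivariate learner, and the rows in `score` are the ones it would predict once trained. This also shows the lag mentioned above: the newest training label is always one horizon behind now().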
