Holt-Winters predicts over a 30-day period as opposed to months

Barclaeys Member Posts: 18 Learner I
Hi, I have just built a Holt-Winters model and used Apply Forecast on monthly aggregated data. My month attribute is of type "Date" and is shown as Sep 1, 2015 / Oct 1, 2015 / Nov 1, 2015...

When I look at the prediction, the result set does not predict for Jun 1, 2020 / Jul 1, 2020 / Aug 1, 2020 but uses Jun 1, 2020 / Jul 30, 2020 / Aug 29, 2020 / Sep 28, 2020, so it seems to keep a fixed 30-day step rather than forecasting for the 1st of each month.

Is this normal and is there any way I can change this?
Thank you!

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,520 RM Data Scientist
    edited August 2020 Solution Accepted
    It somewhat is. These methods assume that your data is equidistant, and sadly months (and years) are not. So what the algorithm uses is, I think, the average of the step sizes, since we prefer to give you a solution over an error.
    I think @tftemme and team are working on a nice solution for those "calendar types". For now there is no real solution other than using a generic ID, like I do in my Auto-Forecasting project: https://community.rapidminer.com/discussion/comment/66543#Comment_66543
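
To illustrate the equidistance point, here is a minimal Python sketch. It is not RapidMiner's implementation. It only assumes, as suggested above, that forecast timestamps are produced by repeating the average step of the training index. The helper names (`month_starts`, `average_step`, `extrapolate`) and the 58-month history starting in Sep 2015 are made up for the illustration.

```python
from datetime import datetime, timedelta


def month_starts(year: int, month: int, count: int) -> list[datetime]:
    """Return `count` consecutive first-of-month timestamps."""
    out = []
    for _ in range(count):
        out.append(datetime(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return out


def average_step(index: list[datetime]) -> timedelta:
    """Mean distance between neighbours, i.e. total span / number of steps."""
    return (index[-1] - index[0]) / (len(index) - 1)


def extrapolate(index: list[datetime], horizon: int) -> list[datetime]:
    """Continue the index with a fixed step (ASSUMPTION: average step)."""
    step = average_step(index)
    return [index[-1] + step * k for k in range(1, horizon + 1)]


# Hypothetical monthly history: Sep 1, 2015 up to Jun 1, 2020.
history = month_starts(2015, 9, 58)
print("average step:", average_step(history))
for ts in extrapolate(history, 4):
    print(ts.date())
```

Because calendar months have 28 to 31 days, the average step here comes out at roughly 30.4 days, so the extrapolated timestamps drift away from the 1st of the month even though every training timestamp sat exactly on it.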

    Alternatively, one can add a postprocessing step that adjusts each date to the nearest 1st of the month.


    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany


  • Barclaeys Member Posts: 18 Learner I
    Martin, thanks for the feedback. This answers my question. Would you have an example of such a postprocessing step?
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,520 RM Data Scientist
    Sure! Below is an example process that does it. It works as long as you do not forecast more than 15 months ahead; forecasting further than that is, I think, fairly rare.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.7.002">
      <operator activated="true" class="process" compatibility="9.7.002" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.7.002" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="136">
            <parameter key="generator_type" value="attribute functions"/>
            <parameter key="number_of_examples" value="100"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions">
              <parameter key="date" value="date_add(date_parse_custom(&quot;01/01/2010&quot;,&quot;dd/MM/yyyy&quot;),id,DATE_UNIT_MONTH)"/>
              <parameter key="value" value="rand()"/>
            </list>
            <parameter key="add_id_attribute" value="true"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="time_series:arima_trainer" compatibility="9.7.000" expanded="true" height="103" name="ARIMA" width="90" x="246" y="136">
            <parameter key="time_series_attribute" value="value"/>
            <parameter key="has_indices" value="true"/>
            <parameter key="indices_attribute" value="date"/>
            <parameter key="p:_order_of_the_autoregressive_model" value="1"/>
            <parameter key="d:_degree_of_differencing" value="0"/>
            <parameter key="q:_order_of_the_moving-average_model" value="1"/>
            <parameter key="estimate_constant" value="true"/>
            <parameter key="main_criterion" value="aic"/>
          </operator>
          <operator activated="true" class="time_series:apply_forecast" compatibility="9.7.000" expanded="true" height="82" name="Apply Forecast" width="90" x="380" y="136">
            <parameter key="forecast_horizon" value="5"/>
            <parameter key="add_original_time_series" value="true"/>
            <parameter key="add_combined_time_series" value="true"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="9.7.002" expanded="true" height="82" name="Generate Attributes" width="90" x="514" y="136">
            <list key="function_descriptions">
              <parameter key="delta_this_month" value="date_diff(&#10;date,date_set(date,1,DATE_UNIT_DAY)&#10;)/1000/60/60/24"/>
              <parameter key="delta_next_month" value="date_diff(&#10;date,&#10;&#10;date_add(&#10;&#9;date_set(&#10;&#9;&#9;date,1,DATE_UNIT_DAY&#10;&#9;),&#10;&#9;1,&#10;&#9;DATE_UNIT_MONTH)&#10;&#9;&#10;)&#9;/1000/60/60/24"/>
              <parameter key="adjusted_date" value="if(&#10;&#9;abs(delta_this_month)&lt;abs(delta_next_month),&#10;&#9;date_set(date,1,DATE_UNIT_DAY),&#10;&#9;date_add(date_set(date,1,DATE_UNIT_DAY),1,DATE_UNIT_MONTH)&#10;&#9;)"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="ARIMA" to_port="example set"/>
          <connect from_op="ARIMA" from_port="forecast model" to_op="Apply Forecast" to_port="forecast model"/>
          <connect from_op="Apply Forecast" from_port="example set" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
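
For readers who want to follow the logic of the three Generate Attributes expressions outside RapidMiner, here is a rough Python equivalent. It is only a sketch: the function name `snap_to_month_start` is made up, and it works on whole calendar dates, whereas the process above computes the deltas from millisecond differences. It compares the distance to the 1st of the current month with the distance to the 1st of the following month and picks the closer one. As in the `adjusted_date` expression, a tie goes to the following month.

```python
from datetime import date


def snap_to_month_start(d: date) -> date:
    """Move `d` to the closest 1st of a month (hypothetical helper)."""
    # Same role as date_set(date, 1, DATE_UNIT_DAY) in the process.
    this_first = d.replace(day=1)
    # Same role as date_add(..., 1, DATE_UNIT_MONTH), with year rollover.
    if this_first.month == 12:
        next_first = date(this_first.year + 1, 1, 1)
    else:
        next_first = date(this_first.year, this_first.month + 1, 1)

    delta_this_month = (d - this_first).days   # days since this month's 1st
    delta_next_month = (next_first - d).days   # days until next month's 1st

    # Strict "<" mirrors the if(...) in adjusted_date: ties go to next month.
    if delta_this_month < delta_next_month:
        return this_first
    return next_first


# Forecast dates quoted in the question.
for d in [date(2020, 6, 1), date(2020, 7, 30), date(2020, 8, 29), date(2020, 9, 28)]:
    print(d, "->", snap_to_month_start(d))
```

Applied to the dates from the question, Jul 30, Aug 29 and Sep 28, 2020 map to Aug 1, Sep 1 and Oct 1, 2020, while Jun 1, 2020 stays unchanged.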

  • Barclaeys Member Posts: 18 Learner I
    Just beautiful. I only had to copy the XML into the RapidMiner GUI and tweak the expression a little. Really great.