RapidMiner

Time Series Forecasting - Demo

Contributor II

Hello all, I'm testing RapidMiner's features and need to build a simple forecast model. I did a little research and found a few videos and tutorials, but all of them test the performance or accuracy of the models and never produce a concrete forecast. Does anyone have a tutorial for building a sales forecast (with only Date and Amount as input) that predicts 1 month ahead?

Thanks in advance.

PS: I'm sorry if I chose the wrong topic.
5 REPLIES
Elite III

Re: Time Series Forecasting - Demo

I quite like the Simafore blog. They have lots of practical examples of how to use RapidMiner for various tasks in a business context.

Here is part one of their two-part Time Series tutorial.
http://www.simafore.com/blog/bid/106430/Using-RapidMiner-for-time-series-forecasting-in-cost-modelin...

Hope that helps.
-- Training, Consulting, Sales in China, Hong Kong & Taiwan --
www.RapidMinerChina.com
Contributor II

Re: Time Series Forecasting - Demo

Thanks JEdward. Actually, I had already tried to adapt that tutorial to my sample data, but I can't really predict anything. I set the Horizon value to 1 or 5, but it just shows my data and no new days with the amount.
Maybe I misunderstood the goals of this kind of tool, but in Microsoft Mining I can get new predicted months/days based on my historical data. Is that possible?

Thanks again, have a nice day!
Elite III

Re: Time Series Forecasting - Demo

Okay. If I understand you correctly, you have values such as:

Year    Population
1999    5m
2000    6m
2001    4.5m
2002    7m


And you want to predict the population in, say, 2053. A very simple way to do that is to generate a trend from the data you have and apply it to your unseen values.
Here's a simple example for demonstration, with no model validation to test accuracy.
It uses the World Bank population data for Germany and, based on the trend, calculates that the population will be 94,494,157 in 2053.
To get the World Bank operator, install the Rapid Finance extension from the Marketplace.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.015">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="subprocess" compatibility="5.3.015" expanded="true" height="94" name="GetTrend" width="90" x="179" y="30">
        <process expanded="true">
          <operator activated="true" class="quantx1:world_bank_data_extractor" compatibility="1.0.006" expanded="true" height="60" name="World Bank Data Extractor" width="90" x="45" y="30">
            <parameter key="Select Indicator" value="SP.POP.TOTL"/>
            <parameter key="Start Year" value="1950"/>
            <parameter key="End Year" value="2000"/>
            <parameter key="Country" value="Germany"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.3.015" expanded="true" height="76" name="Select Attributes" width="90" x="180" y="30">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="|Year|Value"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="5.3.015" expanded="true" height="76" name="Set Role" width="90" x="315" y="30">
            <parameter key="attribute_name" value="Year"/>
            <parameter key="target_role" value="id"/>
            <list key="set_additional_roles">
              <parameter key="Value" value="label"/>
            </list>
          </operator>
          <operator activated="true" class="series:fit_trend" compatibility="5.3.000" expanded="true" height="60" name="Fit Trend" width="90" x="313" y="165">
            <parameter key="attribute" value="Value"/>
            <process expanded="true">
              <operator activated="true" class="neural_net" compatibility="5.3.015" expanded="true" name="Neural Net (2)">
                <list key="hidden_layers"/>
              </operator>
              <operator activated="true" class="remember" compatibility="5.3.015" expanded="true" name="Remember">
                <parameter key="name" value="Trend"/>
                <parameter key="io_object" value="Model"/>
              </operator>
              <connect from_port="example set" to_op="Neural Net (2)" to_port="training set"/>
              <connect from_op="Neural Net (2)" from_port="model" to_op="Remember" to_port="store"/>
              <connect from_op="Remember" from_port="stored" to_port="model"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
            </process>
          </operator>
          <connect from_op="World Bank Data Extractor" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Fit Trend" to_port="example set"/>
          <connect from_op="Fit Trend" from_port="example set with trend" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
          <portSpacing port="sink_out 3" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="5.3.015" expanded="true" height="60" name="EnterYearHere" width="90" x="45" y="165">
        <list key="attribute_values">
          <parameter key="Year" value="2053"/>
        </list>
        <list key="set_additional_roles">
          <parameter key="Year" value="id"/>
        </list>
      </operator>
      <operator activated="true" class="subprocess" compatibility="5.3.015" expanded="true" height="94" name="ApplyTrend" width="90" x="179" y="165">
        <process expanded="true">
          <operator activated="true" class="recall" compatibility="5.3.015" expanded="true" height="60" name="Recall" width="90" x="180" y="30">
            <parameter key="name" value="Trend"/>
            <parameter key="io_object" value="Model"/>
            <parameter key="remove_from_store" value="false"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="315" y="30">
            <list key="application_parameters"/>
          </operator>
          <connect from_port="in 1" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Recall" from_port="result" to_op="Apply Model" to_port="model"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="source_in 2" spacing="0"/>
          <portSpacing port="source_in 3" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="GetTrend" from_port="out 1" to_port="result 1"/>
      <connect from_op="GetTrend" from_port="out 2" to_op="ApplyTrend" to_port="in 2"/>
      <connect from_op="EnterYearHere" from_port="output" to_op="ApplyTrend" to_port="in 1"/>
      <connect from_op="ApplyTrend" from_port="out 1" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
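
For anyone who wants to see the "fit a trend, then apply it to an unseen year" idea outside RapidMiner, here is a minimal Python sketch. It is not a translation of the process above: that process trains a Neural Net inside Fit Trend, whereas this sketch swaps in a plain least-squares straight line to keep things short. The function names `fit_linear_trend` and `apply_trend` are made up for illustration, and the numbers are the toy Year/Population values from the small table earlier in this post, not the World Bank data.

```python
def fit_linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Assumption: a straight line is a good enough trend. The RapidMiner
    process above uses a neural net instead.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x/y co-deviation.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; no trend can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_trend(slope, intercept, x):
    """Evaluate the fitted line at an unseen x (for example a future year)."""
    return slope * x + intercept


# Toy values from the table above (population in millions).
years = [1999, 2000, 2001, 2002]
population_millions = [5.0, 6.0, 4.5, 7.0]

slope, intercept = fit_linear_trend(years, population_millions)
forecast_2053 = apply_trend(slope, intercept, 2053)
print(f"Trend: {slope:.2f}m per year; forecast for 2053: {forecast_2053:.2f}m")
```

The two functions mirror the GetTrend and ApplyTrend subprocesses: one learns the trend from known years, the other scores a year that was not in the training data. As with the process above, there is no validation here, so treat the number as an extrapolation and not as a tested forecast.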


Is this closer to what you are after? 
Contributor II

Re: Time Series Forecasting - Demo

Brilliant! Thank you, it was really helpful.
Contributor

Re: Time Series Forecasting - Demo

What is this for? I need to predict or forecast pipe failures in a drinking-water distribution network in a particular place. Do I have to enter this programming code, or can I build this model with RapidMiner's tools?

Thanks a lot.