Windowing Operator: Is this how it works?

amund Member Posts: 22 Maven
edited November 2018 in Help

Hi,

I'm trying to use the Windowing and Sliding Window Validation operators to predict future values. I've watched Thomas Ott's YouTube video and looked at other posts in the forum, but I am still not confident using these operators, so I'd like to ask some questions. I want to look at the settings at a very basic level to understand how to use them.

Let's say I have 1000 examples in my training set that cover 1000 days of a stock price. Is my understanding here correct?

First, the Windowing operator:

 

Window size: This is the number of days RapidMiner (RM) will use to predict the future value. If I set it to 10, RM will use 10 days of data to predict the future value. For example (let's not think about holidays and weekends), it will use Jan 1 -> Jan 10 to predict Jan 11.

 

Step size: Decides which values to skip, or step over. If the step size is 7, RM will only use the values of Jan 1, 8, 15, etc. The skipped values will be left out and not used for predictions. It is the same as creating a new dataset with the first day of every week and setting the step size to 1.

 

Create label: Here I choose the attribute I want to predict. I set it to "Yes" and chose the closing price attribute.

Here, we also have to set the horizon. Let's say my Window size is 10, step size is 1. If horizon is set to 1, RM will use the values of Jan 1 - Jan 10 to predict the value of Jan 11. If horizon is set to 5, RM will use the values of Jan 1 to Jan 10, to predict the value of Jan 15. Is that right?
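To make the window size, step size, and horizon settings concrete, here is a minimal Python sketch of the windowing idea. It is not RapidMiner code: `make_windows` and its parameters are hypothetical names, and it assumes that the horizon counts positions from the last value in the window to the label, and that the step size is the offset between the starts of consecutive windows (which may not match the skipping behaviour guessed at above).

```python
def make_windows(values, window_size, step_size=1, horizon=1):
    """Return (window, label) pairs cut from a series.

    Assumptions (not taken from RapidMiner itself): the label sits
    `horizon` positions after the last value in the window, and
    `step_size` is the offset between the starts of consecutive windows.
    """
    if window_size < 1 or step_size < 1 or horizon < 1:
        raise ValueError("window_size, step_size and horizon must be >= 1")
    examples = []
    start = 0
    # Stop as soon as the label would fall past the end of the series.
    while start + window_size - 1 + horizon < len(values):
        window = values[start:start + window_size]
        label = values[start + window_size - 1 + horizon]
        examples.append((window, label))
        start += step_size
    return examples


# Days of January stand in for prices so the positions are easy to read.
january = list(range(1, 32))

# Horizon 1: the window is Jan 1..10 and the label is Jan 11.
window, label = make_windows(january, window_size=10, horizon=1)[0]
print(window, "->", label)

# Horizon 5: the window is still Jan 1..10, but the label is Jan 15.
window, label = make_windows(january, window_size=10, horizon=5)[0]
print(window, "->", label)
```

Under these assumptions, 1000 daily values with a window size of 10, a step size of 1, and a horizon of 1 give 990 windowed examples, because the last 10 days have no complete window plus label.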

 

Now on to the Sliding Window Validation operator.

 

Now, as far as I understand, the validator does not improve the model in itself. It is simply a tool to validate whether or not the model I have created is performing well. The results from the validator can be used to understand the model better and optimize it. Correct?

 

In the validator I find these settings.

Training Window Width

Training Window Step Size

Test Window Width

Horizon

 

Here, I am not quite sure what to do. Should these settings simply correspond to the settings in the Windowing operator? I believe this is not the right answer.

 

Following my previous examples, could we create similar examples for these settings to put it into context?

Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    The Sliding Window Validation is used for backtesting. Once you've windowed your data, it slides across your time series in a defined way, training on one window and then testing on another.

    The training window width is just that: how many time units (width) you want your model to be trained on. The test window is your out-of-sample data in the time series, where the model is tested and its performance measured. The step size is how many time units you slide the window ahead. The horizon is just the time unit space between the training and test windows.
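As a rough illustration of those four settings, the sketch below lists which positions would land in each training and test window as the validation slides forward. It is plain Python, not the operator's actual implementation: `sliding_window_splits` is a hypothetical helper, and it assumes that a horizon of 1 means the test window starts immediately after the training window.

```python
def sliding_window_splits(n_examples, train_width, step_size, test_width, horizon=1):
    """List (train_range, test_range) index pairs for a backtest.

    Assumption: horizon=1 places the test window directly after the
    training window; larger values leave a gap of horizon - 1 examples.
    """
    if min(train_width, step_size, test_width, horizon) < 1:
        raise ValueError("all settings must be >= 1")
    splits = []
    start = 0
    while True:
        train_end = start + train_width
        test_start = train_end + horizon - 1
        test_end = test_start + test_width
        if test_end > n_examples:
            break  # the test window no longer fits in the data
        splits.append((range(start, train_end), range(test_start, test_end)))
        start += step_size  # slide both windows ahead
    return splits


# 990 windowed examples, train on 100, test on the next 20, slide by 50.
splits = sliding_window_splits(990, train_width=100, step_size=50, test_width=20)
for train, test in (splits[0], splits[-1]):
    print(f"train {train.start}..{train.stop - 1}  test {test.start}..{test.stop - 1}")
print(len(splits), "train/test rounds")
```

Each round trains a fresh model on its training range and scores it on its test range, so these settings change how performance is estimated rather than the windowed data itself.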

     

    Does this help?

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Did you check out this thread? I go into pretty deep detail on the Windowing operator.

     

    http://community.rapidminer.com/t5/RapidMiner-Studio-Forum/Time-Series-using-Windowing-operator-in-RapidMiner/m-p/31791

  • amund Member Posts: 22 Maven

    Yes, I've read this post and it's very good on the Windowing operator. I just wanted to confirm that my understanding of this operator was correct. I guess my question is really about the Sliding Window Validation operator and how its settings work in relation to the Windowing operator.

    As far as I understand, the settings don't affect the model; they only test its performance, right?

    Would it be right to say that setting a larger window width in the window validator is comparable to reducing the number of folds in an X-Validation?

    Training step size and horizon are still unclear to me.

  • amund Member Posts: 22 Maven

    Ok, so the validator basically takes a window of the windowed data, tests it and moves on to the next, sliding through the data till the end of the example set? (The validator window size is not really related to the Windower window size)

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    That's correct. The Windowing operator helps you create the cross section of your data. The Sliding Window Validation operator takes that window and creates a Training/Testing Window on top of that. 
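The two stages can be sketched end to end in plain Python: first window the series into rows, then slide a training/test split across those rows and record an error per round. Everything here is illustrative and assumed rather than taken from RapidMiner: the function names are hypothetical, the "model" is a toy drift estimate, and the test window is assumed to start right after the training window.

```python
from statistics import mean


def window_rows(values, window_size, horizon=1):
    """Stage 1: turn a series into (window, label) rows, one per start position."""
    last_start = len(values) - window_size - horizon
    return [
        (values[s:s + window_size], values[s + window_size - 1 + horizon])
        for s in range(last_start + 1)
    ]


def fit_drift(train_rows):
    """Toy model: average change from the last window value to the label."""
    return mean(label - window[-1] for window, label in train_rows)


def backtest(rows, train_width, step_size, test_width):
    """Stage 2: slide a train/test split across the windowed rows."""
    fold_errors = []
    start = 0
    while start + train_width + test_width <= len(rows):
        train = rows[start:start + train_width]
        test = rows[start + train_width:start + train_width + test_width]
        drift = fit_drift(train)  # refit on every training window
        # Mean absolute error of "last value + drift" on the test window.
        fold_errors.append(mean(abs(w[-1] + drift - y) for w, y in test))
        start += step_size
    return fold_errors


# Synthetic, perfectly linear "prices" so the toy model has nothing to miss.
prices = [100 + 0.5 * day for day in range(1000)]
rows = window_rows(prices, window_size=10, horizon=1)
errors = backtest(rows, train_width=100, step_size=50, test_width=20)
print(len(rows), "rows,", len(errors), "rounds, worst error:", max(errors))
```

The window size used in `window_rows` and the widths used in `backtest` are set independently, which mirrors the point that the validator's window is not tied to the Windowing operator's window size.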

  • amund Member Posts: 22 Maven

    Great - then I understand. Thanks!

  • fungayim Member Posts: 7 Contributor II
    edited December 2019
    Hi Unicorn,
    This question by Amund is very interesting. I have also learned from it. Thanks.

    I have a question: with window size 10 and step size 1, can I calculate the number of output attributes?
    Thanks.