Question about: Sliding Window Validation...

rapidcrazy Member Posts: 2 Contributor I
edited November 2018 in Help
Hello folks, I would like to know something about the operator Sliding Window Validation.

1- The parameter "training window step size" is described as "Number of examples the window is moved after each iteration",
but is that number of examples counted from the beginning of the training window or from its end?

2- The parameter "horizon" is described as "Increment from last training to first testing example", but does "last training" refer to the beginning of the training window or to its end?

3- Is there (or could you make) a graphic that helps to understand the two points above? I think a graphic would help a lot.

4- One example I saw on the internet had this configuration:
training window width: 20
training window step size: 5
test window width: 20
horizon: 5

I noticed that "training window step size" and "horizon" have the same value. Is it common to set these two parameters to the same value? If so, why?

5- What does the notation "Range: integer; -1-+?" mean?
I have never seen that notation in my life.

6- The parameter "cumulative training" is described as "Indicates if each training window should be added to the old one or should replace the old one". I don't quite understand this. Could you explain it in other words?

Thanks in advance, Rapidcrazy.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869   Unicorn
    Hi Rapidcrazy,

    The documentation is quite clear in most cases.
    1. The beginning of the test window is moved by x steps in each iteration to the right, i.e. from the beginning.
    2. The last training example is at the end of the training window :)
    4. That the two values coincide is pure chance, I would guess. At least I don't see any reason for setting both parameters to the same value in the general case.
    5. This notation is a bit buggy, thank you for pointing this out! It is meant to say that the lowest allowed value is -1 and the upper end of the range is positive infinity.
    6. If this parameter is activated, the training set is not limited to the examples in the current training window: all examples from the beginning of the dataset up to the end of the training window are used.

    Best regards,
    Marius
  • MWM Member Posts: 1 Contributor I

    Hi!

    I have recently started to use the Time Series Extension. However, I am quite confused about how the sliding window validation works. The reason for this confusion is the number of rounds the learner seems to report while running the validation.

    If, for example, I have a data set of 1000 examples, and I set the training window to 500 and the test window to 10, I expect:

    - The first run will train on examples 491 to 990 and test on example 991;
    - The second run will train on examples 492 to 991 and test on example 992; etc.
    - The final run will train on examples 500 to 999 and test on example 1000.

    Therefore the validation should run 10 training cycles of 500 examples each (5000 in total). However, the software only seems to count up to a few hundred examples (always fewer than 1000, of course).

    Can you please explain to me what I have missed in this process?

    Thanks!
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869   Unicorn
    With your settings, the Sliding Window Validation will first train on examples 1-500 and test on 501-510, then train on 501-1000 and test on 1001-1010, then train on 1001-1500, and so on.

    I don't understand the last sentence of your question, but I hope my answer solves it nevertheless :)

    Best regards,
    Marius
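
Since a graphic was asked for above, the window arithmetic described in the answers can be roughly sketched in a few lines of Python. This is only an illustration, not RapidMiner's implementation: the function name `sliding_windows` and its arguments are hypothetical, the 60-example data set is made up, and the sketch assumes a positive step size (it does not model the special value -1) and assumes that an iteration is skipped as soon as its test window no longer fits into the data. Example numbers are 1-based and inclusive, as in the posts.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int]  # (first, last) example number, 1-based and inclusive


def sliding_windows(n_examples: int, train_width: int, step: int,
                    test_width: int, horizon: int = 1,
                    cumulative: bool = False) -> Iterator[Tuple[Window, Window]]:
    """Yield (training window, test window) pairs for one validation run.

    Hypothetical helper, not part of RapidMiner. Assumes all sizes are
    positive and that iteration stops once the test window would run
    past the last example.
    """
    if min(train_width, step, test_width, horizon) < 1:
        raise ValueError("all sizes must be positive in this sketch")
    train_start = 1
    while True:
        # The last training example is at the END of the training window.
        train_end = train_start + train_width - 1
        # The horizon is counted from that last training example.
        test_start = train_end + horizon
        test_end = test_start + test_width - 1
        if test_end > n_examples:
            break
        # Cumulative training: always start from the first example.
        first = 1 if cumulative else train_start
        yield (first, train_end), (test_start, test_end)
        # The step size moves the BEGINNING of the window to the right.
        train_start += step


# Configuration from question 4, on an assumed data set of 60 examples.
for train, test in sliding_windows(60, train_width=20, step=5,
                                   test_width=20, horizon=5):
    print("train", train, "test", test)
# train (1, 20) test (25, 44)
# train (6, 25) test (30, 49)
# train (11, 30) test (35, 54)
# train (16, 35) test (40, 59)
```

Under the same assumptions, 1000 examples with a training window of 500, a test window of 10, a horizon of 1 and a step size of 500 yield a single iteration (train 1-500, test 501-510), because the next test window, 1001-1010, would lie beyond the data. With a step size of 10 the sketch yields 50 iterations. With `cumulative=True` every training window starts at example 1, so the first configuration trains on 1-20, 1-25, 1-30 and 1-35.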