
"Window Optimization for Multivariate Time Series Analysis"

jqford Member Posts: 2 Contributor I
edited May 2019 in Help
I would like to optimize my windowing scheme for a multivariate time series analysis. 

An important feature I would like to explore in my analysis is whether the window size should be optimized on an individual basis for each attribute. For example, my data suggests that the past five measurements of "Attribute A" are relevant to future predictions, but only the past three measurements of "Attribute B" are important. I am concerned that the use of two extra measurements of "Attribute B" by the learner will lead to increased forecasting errors.

I have noted that MultivariateSeries2WindowExamples only permits a single window size in its transposition of data.  Using the example above, this would lead to the inclusion of additional measurements of "Attribute B", which would negatively impact my forecasting performance.

Is there a way to specify a unique window size for each attribute within MultivariateSeries2WindowExamples? Are there any other tools or processes that would allow me to optimize window sizes for individual attributes?
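For illustration, the per-attribute windowing scheme described above can be sketched in pandas; the column names, window sizes, and label choice below are all hypothetical stand-ins, not anything MultivariateSeries2WindowExamples produces:

```python
import pandas as pd

# Hypothetical multivariate series; column names are placeholders.
df = pd.DataFrame({
    "attribute_a": range(10),
    "attribute_b": range(10, 20),
})

# Per-attribute window sizes, as in the scheme described above:
# five lags of Attribute A, only three of Attribute B.
window_sizes = {"attribute_a": 5, "attribute_b": 3}

def window_examples(frame, sizes, label_col="attribute_a"):
    """Build one example per time step, using sizes[col] lagged
    values of each column as features and the current value of
    label_col as the target."""
    parts = {}
    for col, w in sizes.items():
        for lag in range(1, w + 1):
            parts[f"{col}-{lag}"] = frame[col].shift(lag)
    examples = pd.DataFrame(parts)
    examples["label"] = frame[label_col]
    # Drop rows with incomplete windows at the start of the series.
    return examples.dropna()

examples = window_examples(df, window_sizes)
```

Each row then carries exactly the lags requested for each attribute, so the learner never sees the "extra" measurements in the first place.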

In addition, I am wondering whether anyone else has used such a scheme.  I am sure there is a paper out there somewhere on this issue - but I have not located it.  Any suggestions?  I'm not afraid to do some dusty-book research on this if needed.

Thanks in advance for any input!

Josh


Answers

  • TobiasMalbrecht Moderator, Employee, Member Posts: 295 RM Product Management
    Hi Josh,

    including the "extra" (as you call it) lagged values does not necessarily hurt forecasting performance, since the applied learner usually decides which attributes it will use in the model it builds. Hence, if the learner decides the lagged values have an influence on the label, it will include the corresponding attributes in the model; otherwise it will leave them out, or at least not give them much importance.

    Nevertheless, if you want to delete the lagged values from the example set, you may of course do so by applying a [tt]FeatureNameFilter[/tt] after the [tt]MultivariateSeries2WindowExamples[/tt] operator, which removes the corresponding attributes from the example set before a learner is applied.
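Outside RapidMiner, this window-then-filter approach can be sketched in pandas; the column names are hypothetical, and the regular expression stands in for the pattern a [tt]FeatureNameFilter[/tt] would use:

```python
import pandas as pd

# Hypothetical two-attribute series; names are placeholders.
df = pd.DataFrame({"attribute_a": range(10), "attribute_b": range(10, 20)})

# Window all attributes with the single (maximum) size, as a
# single-window operator would.
max_window = 5
windowed = pd.DataFrame({
    f"{col}-{lag}": df[col].shift(lag)
    for col in df.columns
    for lag in range(1, max_window + 1)
}).dropna()

# Then drop the unwanted deeper lags of attribute_b (lags 4 and 5),
# mirroring a name filter applied before the learner.
windowed = windowed.filter(regex=r"^(?!attribute_b-[45])")
```

The result is the same example set as per-attribute windowing would give, just built in two steps.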

    Hope that helps,
    Tobias
  • jqford Member Posts: 2 Contributor I
    That's very helpful!

    Thanks Tobias.

    Josh
  • haddock Member Posts: 849 Maven
    Hi Josh,

    Sounds like we are mining the same sort of stuff; I use about 50 indicators, each with a lookback range of 30 days, giving 1500 attributes. If you enclose a sliding window validation within a parameter optimisation, you can tweak the training window size along with everything else, in order to see what your lookback should be.

    Tobias is right, I know it sounds idle to shove stuff in regardless, but that is what RM loves, and you can always defend your action by saying that you are scrupulous about avoiding bias!
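The "sliding window validation inside a parameter optimisation" idea can be sketched roughly as a rolling-origin search over window sizes; the synthetic random-walk series and plain linear model below are stand-ins, and the split sizes are arbitrary:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
series = rng.normal(size=200).cumsum()  # synthetic series as a stand-in

def make_examples(s, w):
    """Turn a univariate series into (lag-window, next-value) pairs."""
    X = np.array([s[i - w:i] for i in range(w, len(s))])
    y = s[w:]
    return X, y

def sliding_window_score(s, w, train=100, horizon=10, steps=5):
    """Rolling-origin evaluation: train on a fixed-size window,
    test on the next `horizon` points, then slide forward."""
    errors = []
    for start in range(0, steps * horizon, horizon):
        X, y = make_examples(s[start:start + train + horizon + w], w)
        model = LinearRegression().fit(X[:train], y[:train])
        pred = model.predict(X[train:train + horizon])
        errors.append(np.mean((pred - y[train:train + horizon]) ** 2))
    return np.mean(errors)

# The "parameter optimisation" loop: pick the lookback with the
# lowest average out-of-sample error.
best_w = min(range(1, 11), key=lambda w: sliding_window_score(series, w))
```

The same loop could search over per-attribute window sizes instead of a single lookback, at the cost of a larger parameter grid.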