influence of adding last index to windows attribute - time series data.

Thiru · August 2020

Dear all, I'm working on time series data. Please refer to the enclosed process.

1. Currently, I'm generating features using 'Process Windows' with Extract Aggregates as a subprocess. The extracted features are used to train my machine learning model.
2. I've noticed that choosing yes for 'adding last index to windows attribute' in the parameters of the Process Windows operator improves the performance of the model drastically, from 67% accuracy to 97% accuracy. The only difference I can see is one extra column in the generated features. I don't understand how this influences the performance of the model.
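
To make the setup above concrete for readers without the process file, here is a minimal Python sketch of the idea. It is not the RapidMiner implementation: the function name `window_features`, its parameters, the chosen aggregates, and the toy series are all assumptions for illustration. Each sliding window is reduced to aggregate features, and an optional flag adds one extra column holding the position of the window's last example.

```python
from statistics import mean


def window_features(series, window_size, step=1, add_last_index=False):
    """Reduce each sliding window of `series` to one row of aggregate features.

    Hypothetical helper for illustration; not a RapidMiner API.
    """
    rows = []
    for start in range(0, len(series) - window_size + 1, step):
        window = series[start:start + window_size]
        row = {"mean": mean(window), "min": min(window), "max": max(window)}
        if add_last_index:
            # Position of the last example covered by this window.
            row["last_index"] = start + window_size - 1
        rows.append(row)
    return rows


# Toy series (made up for this sketch).
series = [3, 5, 4, 8, 7, 9]
without_index = window_features(series, window_size=3)
with_index = window_features(series, window_size=3, add_last_index=True)

# Columns present only when the flag is switched on.
extra_columns = set(with_index[0]) - set(without_index[0])
```

In this sketch the two feature tables differ only by the `last_index` column, which matches the single extra column described above.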

Is it correct to trust this 97% performance, and can anyone help me understand the role of adding the last index? Thanks.

Regards,
Thiru

jacobcybulski · August 2020

As I have no access to your data, I cannot replicate this exactly. The last index in window attribute is special and is added only so that you can retain the index in the new example set (as an ID). Note, however, that since you aggregate your time series and do not use any of the special attributes (except for the label), the last index vanishes anyway, so there is no impact on the result. You must have changed something else in your process. You may also have got a random effect from a different mix of data on different runs. To eliminate this, set the random seed in the Split Data and Cross Validation operators and see if you still get the amazing performance on two runs. Also try simplifying your process (e.g. remove your stacked ensemble) to isolate the effect.
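
The seed check can be sketched in plain Python. This is only an illustration under assumptions, not the RapidMiner operators: `make_toy_data` and `split_accuracy` are hypothetical helpers, the data is synthetic, and the "model" is a simple nearest-class-mean rule. The point is that with a fixed seed the split is identical on every run, so any remaining change in accuracy must come from a change in the process itself.

```python
import random
from statistics import mean


def make_toy_data(n_per_class=20, seed=0):
    """Synthetic (feature, label) pairs; stands in for the real example set."""
    rng = random.Random(seed)
    return [(rng.gauss(label, 1.0), label)
            for label in (0, 1) for _ in range(n_per_class)]


def split_accuracy(data, seed, test_size=12):
    """Shuffle with a fixed seed, hold out `test_size` rows, return test accuracy."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    train, test = shuffled[:-test_size], shuffled[-test_size:]
    # Toy model: predict the class whose training mean is closer.
    mean0 = mean(x for x, y in train if y == 0)
    mean1 = mean(x for x, y in train if y == 1)
    correct = 0
    for x, y in test:
        predicted = 1 if abs(x - mean1) < abs(x - mean0) else 0
        correct += int(predicted == y)
    return correct / len(test)


data = make_toy_data()

# Same seed on two runs: the split, and therefore the accuracy, is identical.
run_a = split_accuracy(data, seed=1992)
run_b = split_accuracy(data, seed=1992)

# Different seeds: the accuracy may vary purely because of the data mix.
accuracies_by_seed = {seed: split_accuracy(data, seed) for seed in range(5)}
```

If two runs with the same seed still differ by something like 67% versus 97%, the difference is not a random effect of the split.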
Jacob

Altair RapidMiner Community
