I'm wrapping up an exercise on time series. So far I have explored several approaches, ranging from naive models to STL decomposition, Holt-Winters, and ARIMA, all of which are time-series models. I would now like to explore real machine learning models. I have seen a RapidMiner tutorial on windowing that applies a gradient booster.
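
To make the windowing idea concrete: as far as I understand it, the series is cut into overlapping windows, so that each row holds a few past values as features and the next value as the label, and a learner such as a gradient booster is then trained on those rows. Below is a minimal Python sketch of that reshaping step only. It is not the RapidMiner operator; the function name `make_windows`, the `window_size` parameter, and the toy series are my own assumptions for illustration.

```python
def make_windows(series, window_size):
    """Turn a 1-D series into (features, target) rows.

    Each row uses `window_size` consecutive past values as features
    and the value that immediately follows them as the target.
    (Hypothetical helper, not part of RapidMiner.)
    """
    rows, targets = [], []
    for start in range(len(series) - window_size):
        rows.append(list(series[start:start + window_size]))
        targets.append(series[start + window_size])
    return rows, targets


# Toy series, made up for illustration only.
series = [10, 12, 13, 15, 18, 21]
X, y = make_windows(series, window_size=3)

# Each printed line is one training example for a regression learner.
for features, target in zip(X, y):
    print(features, "->", target)
```

After this step the data looks like an ordinary regression table, which is presumably why any supervised learner could be plugged in.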

1- How does this work precisely?

2- How does this differ from, e.g., a neural net model?

3- How are such ML models different than the

@mschmitz

