
# Time windows

Hello.

I have a lot of data that I want to preprocess before using it for machine learning. All my attributes have timestamps, and I would like to build time windows: for example, all data within 5 minutes should be grouped together, and the mean of that group becomes the new value. This should be repeated across all my data, so that I end up with many windows of 5 minutes each and far fewer data points.

Is this possible in RapidMiner?
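To make the target computation concrete, here is a minimal sketch in plain Python rather than a RapidMiner process. The function name `window_means`, the sample readings, and the choice to align windows to the clock (12:00, 12:05, and so on) are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def window_means(rows, width=timedelta(minutes=5)):
    """Average (timestamp, value) pairs over fixed, non-overlapping windows.

    Returns a list of (window_start, mean_value), one per non-empty window.
    """
    epoch = datetime(1970, 1, 1)
    buckets = defaultdict(list)
    for ts, value in rows:
        # Floor the timestamp to the start of the window it falls into.
        index = (ts - epoch) // width
        buckets[epoch + index * width].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical readings: irregular timestamps, one numeric attribute.
rows = [
    (datetime(2017, 1, 1, 12, 0, 10), 1.0),
    (datetime(2017, 1, 1, 12, 2, 30), 2.0),
    (datetime(2017, 1, 1, 12, 4, 59), 3.0),
    (datetime(2017, 1, 1, 12, 5, 0), 10.0),
    (datetime(2017, 1, 1, 12, 9, 0), 20.0),
]
result = window_means(rows)
for start, value in result:
    print(start, value)
```

Five readings collapse into two rows here, one per 5-minute window, regardless of how many readings each window contains.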


## Answers

3,517 RM Data Scientist

Hi Mathias,

Yes, it is. The key operator is Windowing in the Time Series extension. You can combine this with Generate Aggregation.

~Martin

Dortmund, Germany

20 Maven

I am not sure that the Windowing operator does what I need.

I have made this mockup of what I'm trying to achieve: http://imgur.com/a/exoJj

Basically, I need to make a window every five minutes, in which all values within the window are collected and a mean value is calculated from them.

3,517 RM Data Scientist

Hi,

Isn't that a moving average, so you can simply use the Moving Average operator?

~Martin


20 Maven

With a moving average I end up with the same number of examples. I want to combine all examples within a 5-minute window and take the mean value of that window. Then I want to do the same for the next 5 minutes.

So no matter whether I have 10 examples or 100 examples within a 5-minute window, they should all be combined into a single mean value.
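The difference between the two approaches can be sketched in a few lines of plain Python. This is an illustration under assumptions, not a description of how the RapidMiner operators work internally: the function names, the synthetic readings, and the choice to start the first window at the first timestamp are all made up for the example.

```python
from datetime import datetime, timedelta
from statistics import mean

WIDTH = timedelta(minutes=5)


def trailing_means(rows):
    """Moving average: one output per input, averaging the preceding 5 minutes."""
    return [
        mean(v for t, v in rows if ts - WIDTH < t <= ts)
        for ts, _ in rows
    ]


def fixed_window_means(rows):
    """Fixed windows: one output per non-empty 5-minute window."""
    start = rows[0][0]  # assumption: windows begin at the first timestamp
    buckets = {}
    for ts, v in rows:
        buckets.setdefault((ts - start) // WIDTH, []).append(v)
    return [mean(vs) for _, vs in sorted(buckets.items())]


# Hypothetical data: one reading every 30 seconds for 15 minutes.
t0 = datetime(2017, 1, 1, 12, 0)
rows = [(t0 + timedelta(seconds=30 * i), float(i)) for i in range(30)]

moving = trailing_means(rows)
fixed = fixed_window_means(rows)
print(len(rows), len(moving), len(fixed))
```

The moving average returns as many values as there are readings, while the fixed windows return one value per 5-minute block, which is the reduction asked for here.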

20 Maven

Okay, I have come to the conclusion that I can use the Windowing operator together with Generate Aggregation.

I now split column A into 180 new columns, then use Generate Aggregation to find the mean value of those 180 columns. However, I also need to know whether the values are increasing or decreasing across the 180 columns, and I cannot get Generate Aggregation to do this.

Something like what LINEST does in Excel.

Is there any way I can do this?

Thank you in advance!
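For reference, the quantity being asked for is the slope of a least-squares line through the values of one window. A minimal plain-Python sketch follows; it assumes the 180 values are equally spaced, so their positions 0, 1, 2, ... can stand in for time. The names `ols_slope` and `trend`, the tolerance, and the sample rows are hypothetical.

```python
def ols_slope(ys):
    """Least-squares slope of ys against their positions 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two values to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def trend(ys, tolerance=1e-9):
    """Label a window by the sign of its fitted slope."""
    slope = ols_slope(ys)
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "flat"


# Hypothetical windowed rows: each list plays the role of the 180 columns.
rising = [2.0 + 0.5 * i for i in range(180)]
falling = [100.0 - 0.25 * i for i in range(180)]

print(ols_slope(rising), trend(rising))
print(ols_slope(falling), trend(falling))
```

The slope is in units of value per column step; multiply by the number of steps per minute to express it per unit of time.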

3,517 RM Data Scientist

So you would like to fit a line to it and get the coefficients? That's one of the extract operators for time series.


20 Maven

I've tried the Extract Coefficients operator, but it requires series data, whereas I work with an ExampleSet. I've tried to convert the data with the Data to Series operator, but then I can only use one attribute (and I now have 180 attributes because of the Windowing operator).

1,635 Unicorn

I think the "Fit Trend" operator from the Series extension will do what you are looking for. It uses an inner modeling operator, so for a simple linear estimate you can use the linear regression model, or another modeling algorithm if you want something more complex.
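Whichever operator is used, the end result discussed in this thread is one row per 5-minute window holding a mean and a linear trend. The plain-Python sketch below computes both directly from timestamped readings. It is not a description of what Fit Trend does internally; the function name `summarize`, the synthetic data, and starting the first window at the first timestamp are assumptions.

```python
from datetime import datetime, timedelta

WIDTH = timedelta(minutes=5)


def summarize(rows):
    """Return (window_start, mean, slope_per_second) per non-empty window.

    The slope is None when a window has too few distinct timestamps to fit a line.
    """
    start = rows[0][0]  # assumption: windows begin at the first timestamp
    buckets = {}
    for ts, v in rows:
        buckets.setdefault((ts - start) // WIDTH, []).append((ts, v))

    out = []
    for k, points in sorted(buckets.items()):
        window_start = start + k * WIDTH
        xs = [(ts - window_start).total_seconds() for ts, _ in points]
        ys = [v for _, v in points]
        n = len(points)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx if sxx > 0 else None
        out.append((window_start, mean_y, slope))
    return out


# Hypothetical data: rising for the first 5 minutes, falling for the next 5.
t0 = datetime(2017, 1, 1, 12, 0)
rows = [
    (t0 + timedelta(seconds=30 * i), 3.0 * i if i < 10 else 100.0 - 2.0 * i)
    for i in range(20)
]
summary = summarize(rows)
for window_start, avg, slope in summary:
    print(window_start, avg, slope)
```

Because the slope is fitted against the actual timestamps, this variant does not need the readings to be equally spaced.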

Lindon Ventures

Data Science Consulting from Certified RapidMiner Experts