Maintaining Data Resolution with unequal sampling frequencies
Hello, I was given a data set in which the label I want to predict is sampled at intervals of 1 to 6 hours, but the inputs that affect it are sampled every hour. My first thought was to average the inputs to match the sampling frequency of the output, but I was wondering whether there is any way to retain all the data in case something like the variation, rather than just the average value, affects the output. I have attached a scrubbed version of the data set if you would like to take a look. Thanks!
Best Answer
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
You could retain all the data, but you'll need to pivot it so that you still end up with only one row per prediction instance (e.g., every 6 hours). Basically, you would be creating extra attributes for the additional sample points. Once you have done that, you could do further feature generation to capture things like the min/max in the sample window, the range, the standard deviation, or other measures of dispersion across the additional sample points. Take a look at some of the function options in Generate Aggregation for additional ideas.
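For readers who want to prototype the idea outside RapidMiner, here is a minimal sketch in Python with pandas. It is not the RapidMiner process itself. It assumes hourly inputs indexed by a sorted timestamp, a list of label timestamps, and a fixed look-back window of 6 hourly readings per label. The function name `build_window_features`, the column name `x`, and the sample values are all hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd


def build_window_features(inputs, label_times, window=6):
    """Return one row per label timestamp (hypothetical helper).

    Each row holds the last `window` hourly readings as separate lag
    attributes, plus dispersion measures over that same window.
    Assumes `inputs` has a sorted DatetimeIndex.
    """
    rows = []
    for t in label_times:
        # Hourly rows at or before the label time, newest `window` only.
        recent = inputs.loc[:t].tail(window)
        row = {"label_time": t}
        for col in inputs.columns:
            series = recent[col]
            values = series.to_numpy()
            # Pivoted sample points: lag0 is the most recent reading.
            for lag in range(window):
                has_value = lag < len(values)
                row[f"{col}_lag{lag}"] = values[-1 - lag] if has_value else np.nan
            # Window aggregates (mean plus measures of dispersion).
            row[f"{col}_mean"] = series.mean()
            row[f"{col}_min"] = series.min()
            row[f"{col}_max"] = series.max()
            row[f"{col}_range"] = series.max() - series.min()
            row[f"{col}_std"] = series.std()  # sample standard deviation
        rows.append(row)
    return pd.DataFrame(rows).set_index("label_time")


# Made-up example: 12 hourly readings, labels observed at 05:00 and 11:00.
hours = pd.to_datetime("2024-01-01") + pd.to_timedelta(np.arange(12), unit="h")
inputs = pd.DataFrame({"x": np.arange(12, dtype=float)}, index=hours)
label_times = pd.to_datetime(["2024-01-01 05:00", "2024-01-01 11:00"])

features = build_window_features(inputs, label_times, window=6)
print(features)
```

Because the label interval varies between 1 and 6 hours, a fixed window means consecutive rows can share input readings when labels are close together. Whether that overlap is acceptable, or whether the window should instead span only the hours since the previous label, depends on the process being modeled.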