How to model a large data set with different "components"?
I was hesitant to post this because I thought for sure it would already be covered. But so far, in the existing videos and posts, I haven't seen my specific question answered (even if some similar subject matter has been discussed).
I am modeling sports data: baseball and football, for daily fantasy sports purposes. My overall direction is this: I want to build a baseline projection model (time series/pattern recognition), and then add some form of regression on top of that baseline projection to account for game-specific matchup variables.
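To make the two-stage idea concrete, here is a rough sketch in plain Python (not a RapidMiner process). It assumes the baseline is a simple trailing average and that the adjustment is a one-variable least-squares fit on the baseline's errors. The numbers and the `opp_strength` variable are made up for illustration; a real baseline would be a proper time series model.

```python
from statistics import mean

def baseline_projection(points, window=3):
    """Trailing average baseline.

    Returns one projection for each game from the second game onward,
    each using only the games played before it (at most `window` of them).
    """
    return [mean(points[max(0, t - window):t]) for t in range(1, len(points))]

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0  # flat line if the predictor never varies
    return my - b * mx, b

# Hypothetical history for ONE player (values invented for illustration).
points = [10.0, 12.0, 8.0, 15.0, 11.0, 9.0]
opp_strength = [0.5, 0.3, 0.8, 0.1, 0.4, 0.7]  # assumed matchup variable

# Stage 1: baseline projection from past games only.
base = baseline_projection(points)

# Stage 2: regress what the baseline missed on the matchup variable.
residuals = [actual - proj for actual, proj in zip(points[1:], base)]
intercept, slope = fit_line(opp_strength[1:], residuals)

def project(next_baseline, next_matchup):
    """Final projection = baseline + matchup adjustment."""
    return next_baseline + intercept + slope * next_matchup
```

The point of fitting the second stage on residuals is that the matchup variables only have to explain what the baseline got wrong, rather than the player's whole scoring level.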
My question is less about the types of models or the theory and more basic: how can I create models for individual players from a dataset that includes multiple players, in the easiest way possible?
I want to build both models (projection and regression) on player-specific data. I don't want to create models for "all third basemen" or "all running backs"; I want them specific to individual players. However, I don't want to save an individual data file for each player. That process, while likely not too hard with some engineering, seems like a waste of time. There has to be a better way.
I have large data sets in which all the variables and historical data are organized by player and by game. Each row reads something like: game date; player ID; team ID; points scored; then all the stats and situational variables related to that game. Each player has their own row for each game/date.
How would someone with more experience suggest I set up my process, or which models should I leverage, to get player-specific results from a single run through the larger data set?
From my research I have a hunch that a macro and loop setup could be used to restrict the overall data to a player-specific set of examples, one player at a time, based on a macro list. But is there a better, more streamlined way?
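For reference, this is the loop-and-filter pattern I have in mind, written as a rough Python sketch rather than a RapidMiner process. The row layout, the sample values, and `fit_player_model` are all hypothetical stand-ins; the "model" here is just an average so the sketch stays runnable.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows in the layout described above:
# (game date, player ID, team ID, points scored)
rows = [
    ("2021-04-02", "P1", "NYY", 14.0),
    ("2021-04-01", "P1", "NYY", 10.0),
    ("2021-04-01", "P2", "BOS", 6.0),
    ("2021-04-02", "P2", "BOS", 8.0),
    ("2021-04-03", "P2", "BOS", 10.0),
]

def group_by_player(rows):
    """Split one combined table into per-player lists, ordered by date."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)  # index 1 is the player ID
    for games in groups.values():
        games.sort(key=lambda r: r[0])  # ISO dates sort correctly as strings
    return dict(groups)

def fit_player_model(games):
    """Placeholder for the real per-player model (time series, regression, SVM)."""
    return {
        "n_games": len(games),
        "mean_points": mean(g[3] for g in games),
    }

# One pass over the combined data yields one model per player.
models = {
    player_id: fit_player_model(games)
    for player_id, games in group_by_player(rows).items()
}
```

With this shape, a daily or weekly update would just mean re-running the same loop on the refreshed combined file, with no per-player files to maintain.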
Last note: my question (again) is less about using specific operators. I have had success with single-player data sets by following the instructions for time series, regression, and SVMs (THANKS, THOMAS OTT). Now I need the best way to move from single-player data sets to larger, multi-player data sets. I will need to update these daily or weekly, hence my quest for simplicity if possible.
Thanks (and sorry if I am in the wrong place with a bad question).