Supervised learning data prep for sport prediction
Hi All,
I have sports data and am looking to build a model for predicting the outcome of games.
My data has a single row for each team, i.e.
Team StatA StatB StatC
A 4 6 8
B 3 9 8
C 4 6 5
.....
In my data, Team A plays Team B, Team C plays Team D, etc.
There are two ways I can do this. The first is matched pairs (can RapidMiner do this?), so my data would look like:
ID Team StatA StatB StatC
G1 A 4 6 8
G1 B 3 9 8
G2 C 4 6 5
Then you tell the program that ID is the matched-ID field, so it knows the first and second rows are a matched pair and builds the model accordingly.
The other way is to transform the data into one row per game, like this:
AStatA AStatB AStatC HStatA HStatB HStatC
4 6 8 3 9 8
Now all the data from both teams in a single match is on a single row, and I build the model this way.
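To make the reshaping concrete, here is a minimal sketch in plain Python (not RapidMiner) of going from the matched-pairs layout to the one-row-per-game layout. The function name `to_wide` is hypothetical, and it assumes the first row of each ID feeds the "A" columns and the second row feeds the "H" columns. Games without exactly two rows (like G2 in the sample, whose opponent row isn't shown) are skipped.

```python
from collections import defaultdict

# Long format: one row per team per game (values from the sample above;
# G2 has only one row here because its opponent is not shown).
rows = [
    {"ID": "G1", "Team": "A", "StatA": 4, "StatB": 6, "StatC": 8},
    {"ID": "G1", "Team": "B", "StatA": 3, "StatB": 9, "StatC": 8},
    {"ID": "G2", "Team": "C", "StatA": 4, "StatB": 6, "StatC": 5},
]

STATS = ["StatA", "StatB", "StatC"]


def to_wide(rows, stats=STATS):
    """Pair up rows sharing an ID and emit one row per complete game."""
    games = defaultdict(list)
    for row in rows:
        games[row["ID"]].append(row)

    wide = []
    for game_id, pair in games.items():
        if len(pair) != 2:
            continue  # skip games that do not have exactly two team rows
        # Assumption: first row is the "A" side, second row is the "H" side.
        first, second = pair
        out = {"ID": game_id}
        for s in stats:
            out["A" + s] = first[s]
        for s in stats:
            out["H" + s] = second[s]
        wide.append(out)
    return wide


wide_rows = to_wide(rows)
print(wide_rows)
```

With the sample rows this yields a single G1 row holding 4 6 8 3 9 8, matching the layout above. An outcome column (the label to predict) would still need to be added per game before training.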
Can I get pros and cons for each? Will they yield the same result, and is the first approach (matched pairs) even possible in RapidMiner? (I know it is in SAS.)