model problem: finding the 'good' observation in each set of observations
Hello,
I'm somewhat new to RapidMiner and need help or suggestions on how to set up a process for the following problem:
I have a pretty large dataset (over 1 million observations): there are 1-20 observations with the same key. Besides the key, each observation has 4 numeric attributes.
Also there is one label (0 or 1) and it's 1 for just one observation in each 'key-group'.
In other words: for each key there is exactly one observation marked with a 1 and all others have a 0.
The challenge is to find the relationship between the other 4 attributes and the label.
If I just use a decision tree, then obviously it doesn't understand that all observations with the same key belong together.
Any advice would be really great, especially on how to model that there is just one 'good' observation for each key.
Thank you
Tobias
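To make the setup concrete, here is a minimal sketch in plain Python (not a RapidMiner process) of one way to express the "exactly one 'good' observation per key" constraint: score every observation, then pick the highest-scoring one within each key. Everything in it is an assumption for illustration: the toy rows, the attribute names a1 to a4, and in particular the `score` function, which is only a placeholder for the confidence for label 1 that a trained model would produce.

```python
from collections import defaultdict

# Toy rows: (key, a1, a2, a3, a4, label). All values are made up for illustration.
rows = [
    ("A", 0.2, 1.0, 3.0, 0.5, 0),
    ("A", 0.9, 0.8, 2.0, 0.1, 1),
    ("A", 0.4, 0.3, 1.0, 0.7, 0),
    ("B", 0.7, 0.2, 5.0, 0.9, 1),
    ("B", 0.1, 0.6, 4.0, 0.3, 0),
    ("C", 0.5, 0.5, 2.5, 0.4, 1),
]


def keys_violating_constraint(rows):
    """Return the keys that do not have exactly one observation labeled 1."""
    positives = defaultdict(int)
    for key, *_features, label in rows:
        positives[key] += label
    return [key for key, count in positives.items() if count != 1]


def score(features):
    """Hypothetical stand-in for a trained model's confidence for label 1."""
    a1, a2, a3, a4 = features
    return a1  # placeholder rule: pretend the first attribute drives the label


def pick_best_per_key(rows, score_fn):
    """For each key, return the row index of the highest-scoring observation."""
    best = {}
    for index, (key, *features, _label) in enumerate(rows):
        s = score_fn(features)
        if key not in best or s > best[key][0]:
            best[key] = (s, index)
    return {key: index for key, (_s, index) in best.items()}


def top1_accuracy(rows, picks):
    """Fraction of keys whose picked observation really carries label 1."""
    hits = sum(rows[index][-1] for index in picks.values())
    return hits / len(picks)


picks = pick_best_per_key(rows, score)
print(keys_violating_constraint(rows), picks, top1_accuracy(rows, picks))
```

The point of the design is that the model only has to rank observations. The per-key selection step is what guarantees exactly one predicted 1 per key, and accuracy is measured per key rather than per row. Since the 'good' observation may only be good relative to the others with the same key, attributes expressed relative to the group (for example, the difference from the group mean) might also be worth trying, though that is a guess without seeing the data.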