model problem: finding the 'good' observation in each set of observations

tobiase · October 2014

Hello,

I'm somewhat new to RapidMiner and need help / suggestions on how to set up a process for the following problem:

I have a pretty large dataset (over 1 million observations). Each key is shared by 1 to 20 observations, and besides the key every observation has 4 numeric attributes.

There is also one label (0 or 1), and it is 1 for just one observation in each 'key-group'.
In other words: for each key there is exactly one observation marked with a 1, and all others have a 0.
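
To make the data layout concrete, here is a minimal Python sketch (not a RapidMiner process). The rows, the key names and the helper `keys_violating_one_positive` are made up for illustration; key `k3` deliberately breaks the rule so the check has something to report.

```python
from collections import defaultdict

# Hypothetical toy rows: (key, attr1, attr2, attr3, attr4, label)
rows = [
    ("k1", 0.2, 1.0, 3.0, 5.0, 0),
    ("k1", 0.9, 2.0, 1.0, 4.0, 1),
    ("k2", 0.5, 0.5, 2.0, 2.0, 1),
    ("k3", 0.1, 3.0, 3.0, 1.0, 0),  # k3 has no row labelled 1
    ("k3", 0.4, 1.0, 2.0, 6.0, 0),
]

def keys_violating_one_positive(rows):
    """Return the keys that do not have exactly one row labelled 1."""
    positives = defaultdict(int)
    for key, *_attrs, label in rows:
        positives[key] += label
    return sorted(k for k, n in positives.items() if n != 1)

bad_keys = keys_violating_one_positive(rows)
print(bad_keys)
```

Running a check like this on the real data first would confirm that the "exactly one 1 per key" assumption actually holds before any modelling.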

The challenge is to find the relationship between the 4 numeric attributes and the label.

If I just use a decision tree, it treats every observation independently and has no notion that all observations with the same key belong together.
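
One possible way to express the grouping, sketched in Python under my own assumptions rather than as a known RapidMiner recipe: treat whatever the model outputs as a per-row score, then pick the highest-scoring row within each key, so that exactly one observation per key is chosen. The `score` function below is a hypothetical stand-in (it just returns the first attribute); in practice it would be a trained model's confidence for label 1.

```python
# Hypothetical toy rows: (key, attr1, attr2, attr3, attr4)
rows = [
    ("k1", 0.2, 1.0, 3.0, 5.0),
    ("k1", 0.9, 2.0, 1.0, 4.0),
    ("k2", 0.5, 0.5, 2.0, 2.0),
    ("k3", 0.1, 3.0, 3.0, 1.0),
    ("k3", 0.4, 1.0, 2.0, 6.0),
]

def score(attrs):
    """Placeholder scorer (assumption): stands in for a model's confidence."""
    return attrs[0]

def pick_best_per_key(rows, score):
    """Return, for each key, the attribute list of its highest-scoring row."""
    best = {}
    for key, *attrs in rows:
        s = score(attrs)
        # Keep the row only if it beats the current best for this key.
        if key not in best or s > best[key][0]:
            best[key] = (s, attrs)
    return {key: attrs for key, (s, attrs) in best.items()}

picked = pick_best_per_key(rows, score)
print(picked)
```

With this framing the per-row classifier can stay as it is; the "one good observation per key" constraint is enforced afterwards by the per-key maximum.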

Any advice would be really great, especially on how to model that there is just one 'good' observation for each key.

Thank you
Tobias
