That's great, thank you! Please 1) sign up with DrivenData, 2) join the Pover-T competition, and 3) send me your username. I will add you to the team.
@sgenzer My username is JEdward. Add me in when you get a chance.
FYI @mschmitz has given me his latest effort, which I will post this week. Hopefully our rankings will rise! I will also post his work on the server.
@SvenVanPoucke - sent you a PM.
I've created some processes that do the ETL needed to read the individual and household data together. If you need to do this, my processes might save you a bit of time. I've done extensive dummy encoding, which produces a large number of attributes, and spent effort making sure the attributes match between the training and test sets. The processes make extensive use of collections to try to minimise the administrative burden of making changes in multiple places (although the result is a bit of a monster).
Hopefully they're attached to this post. They are RapidMiner V8 processes.
There are 4 in total: the 2 "preprocess" ones handle the household and individual data, the "merge" one calls these, and the "save" one calls the "merge" one to save the files.
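For anyone working outside RapidMiner, the same idea (dummy-encode, combine individual and household rows, then force the test attributes to match the training attributes) can be sketched roughly in pandas. This is not a translation of the attached processes. The column names (`id`, `roof`, `job`), the function names, the toy data, and the choice to average individual dummies per household are all assumptions made for illustration.

```python
import pandas as pd


def merge_household_individual(hhold, indiv, key="id"):
    """Dummy-encode both tables and join individual data onto household rows.

    Assumption: `key` is a household identifier present in both tables.
    Individual rows are averaged per household, which is only one of
    several reasonable ways to aggregate them.
    """
    indiv_dummies = pd.get_dummies(indiv.set_index(key), dtype=float)
    per_household = indiv_dummies.groupby(level=0).mean().add_prefix("indiv_")
    hhold_dummies = pd.get_dummies(hhold.set_index(key), dtype=float)
    return hhold_dummies.join(per_household, how="left")


def align_to_train(train, test):
    """Give `test` exactly the columns of `train`, in the same order.

    Columns missing from `test` are filled with 0.0; columns that appear
    only in `test` are dropped.
    """
    return test.reindex(columns=train.columns, fill_value=0.0)


# Toy data with hypothetical column names.
hh_train = pd.DataFrame({"id": [1, 2], "roof": ["tin", "tile"]})
ind_train = pd.DataFrame({"id": [1, 1, 2], "job": ["farm", "none", "shop"]})
hh_test = pd.DataFrame({"id": [3], "roof": ["tin"]})
ind_test = pd.DataFrame({"id": [3], "job": ["farm"]})

train = merge_household_individual(hh_train, ind_train)
test = align_to_train(train, merge_household_individual(hh_test, ind_test))
```

After this, `train` and `test` have identical column lists, so a model fitted on one can score the other. Category values that never occur in the test data simply become all-zero columns there.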