I am working on a project and need guidance on the whole pipeline, from preparing the training data to model selection. I am trying to see whether I can predict how any given member of the U.S. Congress will vote on a newly introduced bill based on financial influence. All of the financial data is available from opensecrets.org, sunlightfoundation.com, thomas.loc.gov, etc.

My plan is to analyze each member's financials and look at the companies that have donated money to them. I need to be able to analyze a bill and assign it an attribute that ties it directly to a particular industry (possibly through text mining). The next step would be to determine whether the bill would adversely affect the financial interests of companies in that industry, which would then let us predict how the member will vote (assuming that they vote in favor of their financial sponsors).

I am doing this project to see whether there is a blatant trend of such decisions being made for financial gain rather than popular support. I am pressed for time to complete this for school, so if anyone can give me some guidance on setting up a proper testing framework, your help will be much appreciated and noted. Below are the sample data sets I have so far for training purposes (I still need one for the bills). Please advise.
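For concreteness, the approach described above could be sketched roughly as follows. This is only a minimal baseline under stated assumptions, not a finished method: the keyword lists, the function names `tag_industry` and `predict_vote`, the 10% donation-share threshold, and the `bill_helps_industry` flag (which would have to be labeled by hand or by a separate model) are all hypothetical placeholders, and the numbers in the usage lines are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword lists per industry. A real project would likely
# learn these from labeled bills instead of hard-coding them.
INDUSTRY_KEYWORDS = {
    "oil_gas": {"oil", "gas", "drilling", "pipeline", "petroleum"},
    "pharma": {"drug", "pharmaceutical", "prescription", "vaccine"},
    "finance": {"bank", "securities", "lending", "credit"},
}


def tag_industry(bill_text):
    """Return the industry whose keywords occur most often, or None."""
    tokens = re.findall(r"[a-z]+", bill_text.lower())
    counts = Counter()
    for industry, keywords in INDUSTRY_KEYWORDS.items():
        counts[industry] = sum(1 for token in tokens if token in keywords)
    industry, hits = counts.most_common(1)[0]
    return industry if hits > 0 else None


def predict_vote(donations, industry, bill_helps_industry, threshold=0.10):
    """Baseline rule: follow the donor industry's interest.

    donations maps industry name -> total amount donated to the member.
    Returns "yea", "nay", or None when the industry accounts for less
    than `threshold` of the member's donations (no prediction made).
    """
    total = sum(donations.values())
    if industry is None or total == 0:
        return None
    share = donations.get(industry, 0) / total
    if share < threshold:
        return None
    return "yea" if bill_helps_industry else "nay"


# Illustrative usage with made-up numbers.
bill = "A bill to restrict offshore drilling and oil pipeline permits."
industry = tag_industry(bill)
member_donations = {"oil_gas": 50000, "pharma": 10000}
prediction = predict_vote(member_donations, industry, bill_helps_industry=False)
```

Comparing predictions from a rule like this against actual roll-call votes on held-out bills would give a baseline accuracy that any trained classifier has to beat.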