RapidMiner

Enron Email Dataset

by Community Manager on ‎07-12-2016 06:20 AM

http://www.cs.cmu.edu/~enron/

 

All you text miners - this is the classic dataset. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation.

 

Some young whippersnapper in the office asked me who Enron were recently - oh how time flies.

Comments
robin
Regular Contributor

How would you recommend reading this data set in? I have been playing with it for a number of years now and it has been sitting in my archives since 2015. 

 

The problem I have is that each of the markers that could be used to define the fields is also present in the text. For example, "To:" is one of the fields one would want to extract from the data, but this marker also appears in the mail body.
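
One possible way around this, sketched in Python rather than as a RapidMiner process: assuming each message in the download is a separate plain-text file in standard mail format, the header block ends at the first blank line, so a mail parser only reads "To:" from the header block and leaves any later "To:" lines in the body untouched. The sample message, the addresses and the parse_message helper below are made up for illustration; only the standard library email module is used.

```python
import email

# Made-up sample in standard mail format: headers, a blank line, then the body.
# The body contains a forwarded message with its own "To:" line.
SAMPLE = (
    "From: alice@example.invalid\n"
    "To: bob@example.invalid\n"
    "Subject: Forwarded note\n"
    "\n"
    "Here is the note.\n"
    "\n"
    "-----Original Message-----\n"
    "From: carol@example.invalid\n"
    "To: dave@example.invalid\n"
    "Subject: earlier thread\n"
    "\n"
    "Body text of the earlier message.\n"
)


def parse_message(raw):
    """Split one raw message into header fields and body (hypothetical helper)."""
    msg = email.message_from_string(raw)
    return {
        "from": msg["From"],        # looked up in the header block only
        "to": msg["To"],
        "subject": msg["Subject"],
        "body": msg.get_payload(),  # everything after the first blank line
    }


record = parse_message(SAMPLE)
print(record["to"])
print(record["body"])
```

For real files, the same helper could be fed the text of each file in turn; the file encoding and directory layout would need checking against the actual download. The resulting records could then be written to CSV and read into RapidMiner as an ordinary table.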
