Access e-mails with the Text Processing Extension
E-mail is probably the most common source of text you encounter every day. It is also a common data source for text mining applications such as spam detection or sentiment analysis.
Direct access to an e-mail account and automatic processing of mails can be a great boon when putting text mining tasks into production. New incoming mails can be processed and their content scored, and the score can in turn trigger further actions.
Read Documents (Mail)
The Text Processing Extension available at the RapidMiner marketplace includes an Operator that allows exactly that:
Read Documents (Mail).
It provides access via the common IMAP protocol or the older POP3 standard. All you need to know is the host address, the login data, and the protocol in use. The folder parameter is also important, as you seldom want to download the complete inbox.
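To make the role of these parameters more concrete, here is a minimal Python sketch of the same idea outside of RapidMiner. It is not how the Operator is implemented; it only illustrates what host, login data, protocol, and folder are used for. The names MailSettings, body_text, and read_folder are hypothetical, and the sketch assumes an IMAP server reachable over SSL on the standard port.

```python
import email
import imaplib
from dataclasses import dataclass
from email.message import Message

# Standard ports, keyed by (protocol, use_ssl).
DEFAULT_PORTS = {
    ("imap", True): 993,
    ("imap", False): 143,
    ("pop3", True): 995,
    ("pop3", False): 110,
}


@dataclass
class MailSettings:
    """Hypothetical container for the connection parameters named above."""
    host: str
    user: str
    password: str
    protocol: str = "imap"
    folder: str = "INBOX"
    use_ssl: bool = True

    @property
    def port(self) -> int:
        return DEFAULT_PORTS[(self.protocol, self.use_ssl)]


def body_text(msg: Message) -> str:
    """Return the first text/plain part of a message, or an empty string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def read_folder(settings: MailSettings) -> list[str]:
    """Fetch all mails of one folder via IMAP over SSL and return their text."""
    if settings.protocol != "imap" or not settings.use_ssl:
        raise ValueError("this sketch only covers IMAP over SSL")
    with imaplib.IMAP4_SSL(settings.host, settings.port) as conn:
        conn.login(settings.user, settings.password)
        # Open only the requested folder, read-only, instead of the whole account.
        conn.select(settings.folder, readonly=True)
        _, data = conn.search(None, "ALL")
        texts = []
        for num in data[0].split():
            _, fetched = conn.fetch(num, "(RFC822)")
            texts.append(body_text(email.message_from_bytes(fetched[0][1])))
        return texts
```

Calling read_folder(MailSettings("imap.example.com", "user", "secret", folder="Reports")) would return one string per mail in that folder; the host and folder name here are placeholders.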
Unfortunately, it is not always that simple. Some e-mail providers require additional parameters to access their services, and these settings are sometimes hard to find on their sites.
Below are the settings for the two most common providers: Microsoft Outlook (Office365) and Gmail.
The Process Documents from Mail Store Operator is similar and requires the same connection settings. With this Operator, it is possible to work directly on single e-mails. The Operator also allows you to download attached files.
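For readers who want to see what downloading attachments amounts to, the following Python sketch extracts attached files from a single parsed mail. It is only an illustration under stated assumptions, not the Operator's actual behavior: save_attachments is a hypothetical helper, and it assumes that attachments are marked with a Content-Disposition of "attachment" and carry a file name.

```python
from email.message import Message
from pathlib import Path


def save_attachments(msg: Message, target_dir: str) -> list[Path]:
    """Write every named attachment of msg into target_dir and return the paths."""
    saved = []
    for part in msg.walk():
        # Skip the body and container parts; keep only real attachments.
        if part.get_content_disposition() != "attachment":
            continue
        name = part.get_filename()
        if not name:
            continue
        # Use only the final path component so a mail cannot write elsewhere.
        path = Path(target_dir) / Path(name).name
        path.write_bytes(part.get_payload(decode=True))
        saved.append(path)
    return saved
```

The function works on any message object produced by Python's email package, so it can be applied to each mail fetched from a folder.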