
"Process documents from mail vs gmail"

jombree Member Posts: 2 Contributor I
edited June 2019 in Help
I am trying to process e-mails from a Gmail account, but I can't figure out how to configure the 'process documents from mail' operator to access the mailbox.
Could you please help me with a working example or any hints?
I'll need to do the same for an Exchange server mailbox afterwards.
Thank you, J.

Answers

  • Rene Member Posts: 24 Contributor II
    Interesting. I was playing around with it yesterday and I guess I had
    the same questions as you. ;-)  This process connects to Gmail,
    fetches the mails and decodes the subject lines:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
        <process expanded="true" height="280" width="547">
          <operator activated="false" class="text:read_documents_mail" compatibility="5.1.001" expanded="true" height="60" name="GMX" width="90" x="45" y="30">
            <parameter key="host" value="pop.gmx.net"/>
            <parameter key="user" value="username@gmx.de"/>
            <parameter key="password" value="password"/>
            <list key="connection_properties">
              <parameter key="port" value="995"/>
            </list>
            <parameter key="protocol" value="pop3"/>
            <parameter key="only_unseen" value="false"/>
            <parameter key="mark_seen" value="false"/>
          </operator>
          <operator activated="true" class="text:read_documents_mail" compatibility="5.1.001" expanded="true" height="60" name="Gmail" width="90" x="45" y="120">
            <description>POP3 message store:
    http://javamail.kenai.com/nonav/javadocs/com/sun/mail/pop3/package-summary.html</description>
            <parameter key="host" value="pop.googlemail.com"/>
            <parameter key="user" value="username"/>
            <parameter key="password" value="password"/>
            <list key="connection_properties">
              <parameter key="mail.pop3.port" value="995"/>
              <parameter key="mail.pop3.ssl.enable" value="true"/>
              <parameter key="mail.pop3.timeout" value="5000"/>
              <parameter key="mail.pop3.connectiontimeout" value="5000"/>
            </list>
            <parameter key="protocol" value="pop3"/>
            <parameter key="mark_seen" value="false"/>
          </operator>
          <operator activated="true" class="text:documents_to_data" compatibility="5.1.001" expanded="true" height="76" name="Documents to Data" width="90" x="179" y="120">
            <parameter key="text_attribute" value="mail"/>
          </operator>
          <operator activated="true" class="execute_script" compatibility="5.1.006" expanded="true" height="76" name="decodeSubject" width="90" x="313" y="120">
            <parameter key="script" value="import javax.mail.internet.*;&#10;ExampleSet exampleSet = operator.getInput(ExampleSet.class);&#10;for (Example example : exampleSet) {&#10;&#9;example[&quot;Subject&quot;] = MimeUtility.decodeText(example[&quot;Subject&quot;]);&#10;}&#10;return exampleSet; &#10;"/>
          </operator>
          <connect from_op="Gmail" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
          <connect from_op="Documents to Data" from_port="example set" to_op="decodeSubject" to_port="input 1"/>
          <connect from_op="decodeSubject" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    At least it worked for me ...

    greetings,
    rené
  • jombree Member Posts: 2 Contributor I
    Thank you René, it works like a charm. Do you have any idea how to do the same with an MS Exchange server?
  • Rene Member Posts: 24 Contributor II
    Hm, no, unfortunately not. I have no experience with MS Exchange servers. But here it says:
    The server addresses (POP, IMAP, SMTP) will need to be configured with the IP address of your Exchange server. [...] The username has to be specified as DOMAIN\USERNAME\MAILBOX in the case of a POP3 configuration, or DOMAIN/USERNAME/MAILBOX in the case of IMAP. [...] The password field may be configured with the user's NT Domain Account password if you don't want to type it in each time you connect. The email address that you supply will be the Internet email address associated with the mailbox.
    Does this work?
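    A minimal, untested sketch of how the quoted advice might translate into an operator like the Gmail one above. It assumes the Exchange server offers POP3 over SSL on port 995; the address 192.0.2.10, the operator name "Exchange" and the DOMAIN\USERNAME\MAILBOX value are placeholders, not values from a real setup:

    ```xml
    <operator activated="true" class="text:read_documents_mail" compatibility="5.1.001" expanded="true" height="60" name="Exchange" width="90" x="45" y="210">
      <!-- Placeholder: replace with the IP address of your Exchange server -->
      <parameter key="host" value="192.0.2.10"/>
      <!-- POP3 form from the quoted text; for IMAP it would be DOMAIN/USERNAME/MAILBOX -->
      <parameter key="user" value="DOMAIN\USERNAME\MAILBOX"/>
      <!-- Placeholder: the user's NT Domain Account password -->
      <parameter key="password" value="password"/>
      <list key="connection_properties">
        <!-- Assumption: POP3 over SSL on port 995, as in the Gmail operator -->
        <parameter key="mail.pop3.port" value="995"/>
        <parameter key="mail.pop3.ssl.enable" value="true"/>
      </list>
      <parameter key="protocol" value="pop3"/>
      <parameter key="mark_seen" value="false"/>
    </operator>
    ```

    If POP3 is not enabled on the server, the connection will fail regardless of the user name format, so it is worth checking with the server administrator first.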
  • JALS Member Posts: 1 Learner III
    Hi, thanks for the information. I'm a new RapidMiner user and I have the same question about how to configure the HOST and USER fields of "Process Documents from Mail Store" for Gmail and Outlook Web Access. Please, I need help and a basic explanation (I am no expert in SQL). Thanks a lot.
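
For the HOST and USER question, the two fields correspond to the "host" and "user" parameters in René's process above: the host is the address of the mail server (not the webmail URL), and the user is the mailbox login. No SQL is involved. As a hedged illustration, here is an IMAP variant of the Gmail operator. It assumes that the operator accepts "imap" as a protocol value and that the account allows IMAP access; the user name and password are placeholders, and the `mail.imap.*` keys are the standard JavaMail IMAP property names:

```xml
<operator activated="true" class="text:read_documents_mail" compatibility="5.1.001" expanded="true" height="60" name="Gmail IMAP" width="90" x="45" y="210">
  <!-- HOST: Gmail's IMAP server -->
  <parameter key="host" value="imap.gmail.com"/>
  <!-- USER: the mailbox login (placeholder) -->
  <parameter key="user" value="username@gmail.com"/>
  <parameter key="password" value="password"/>
  <list key="connection_properties">
    <!-- IMAP over SSL uses port 993 -->
    <parameter key="mail.imap.port" value="993"/>
    <parameter key="mail.imap.ssl.enable" value="true"/>
  </list>
  <!-- Assumption: "imap" is a valid value here; the process above only shows "pop3" -->
  <parameter key="protocol" value="imap"/>
  <parameter key="mark_seen" value="false"/>
</operator>
```

For Outlook Web Access, the mailbox sits on an Exchange server, so the host would be that server's address and the user would follow the DOMAIN\USERNAME\MAILBOX (POP3) or DOMAIN/USERNAME/MAILBOX (IMAP) form quoted earlier in the thread.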