Need some help in the application "Process documents from Mail Store"

subhasisdasgupt Member Posts: 15 Contributor II
I love this software and am still exploring it to understand the full potential of this awesome data mining tool. After the release of RM 5.3, I tried the "Process Documents from Mail Store" operator to extract mail information from my Google account. I enabled the POP protocol in the mail settings and provided all the connection properties needed to access my mailbox through RM. It worked the first time, but on every later run the same process extracts nothing, even after unchecking the "Only Unseen" check box. I also tried to extract mail information from other mail folders, but each time RM gave the error "Folder is not INBOX" (perhaps this is a limitation for now). The process XML is below.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
    <process expanded="true" height="235" width="212">
      <operator activated="true" class="text:process_mail_documents" compatibility="5.3.000" expanded="true" height="76" name="Process Documents from Mail Store" width="90" x="45" y="30">
        <parameter key="host" value="pop.gmail.com"/>
        <parameter key="user" value="[email protected]"/>
        <parameter key="password" value="2mL+AExnqBsRWegwOE5qdw=="/>
        <list key="connection_properties">
          <parameter key="mail.pop3.port" value="995"/>
          <parameter key="mail.pop3.ssl.enable" value="true"/>
          <parameter key="mail.pop3.timeout" value="5000"/>
          <parameter key="mail.pop3.connectiontimeout" value="5000"/>
        </list>
        <parameter key="protocol" value="pop3"/>
        <parameter key="mark_seen" value="false"/>
        <process expanded="true" height="446" width="729">
          <operator activated="false" class="web:extract_html_text_content" compatibility="5.3.000" expanded="true" height="60" name="Extract Content" width="90" x="45" y="30"/>
          <operator activated="true" class="text:tokenize" compatibility="5.3.000" expanded="true" height="60" name="Tokenize" width="90" x="246" y="30"/>
          <operator activated="false" class="text:filter_stopwords_english" compatibility="5.3.000" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="447" y="30"/>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="false" class="text:read_documents_mail" compatibility="5.3.000" expanded="true" height="60" name="Read Documents (Mail)" width="90" x="112" y="165">
        <parameter key="host" value="imap.gmail.com"/>
        <parameter key="user" value="[email protected]"/>
        <parameter key="password" value="2m+AExnqBsWegwOE5qdw=="/>
        <list key="connection_properties">
          <parameter key="mail.imap.port" value="993"/>
          <parameter key="mail.imap.ssl.enable" value="true"/>
          <parameter key="mail.imap.timeout" value="50000"/>
          <parameter key="mail.imap.connectiontimeout" value="50000"/>
        </list>
        <parameter key="protocol" value="imap"/>
        <parameter key="mark_seen" value="false"/>
        <parameter key="folder" value="inbox"/>
      </operator>
      <connect from_op="Process Documents from Mail Store" from_port="example set" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

The password in this XML has been changed to avoid any unwanted access. Please suggest how I can use the same process to also extract the mail information that the software already extracted earlier.

Thanks
Subhasis

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Subhasis,

    Did you check with other software (or a mail client) whether the mail is still in your inbox?
    You may have better success switching to the IMAP protocol. The POP protocol is somewhat outdated and, as far as I know, has no good support for folders.

    Best regards,
    Marius
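
    The switch to IMAP suggested above might look roughly like the fragment below. It is only a sketch: it reuses the host, port, SSL and timeout values from the disabled "Read Documents (Mail)" operator in the question and applies them to "Process Documents from Mail Store". That this operator accepts the same "folder" key is an assumption, not something confirmed in this thread, and the password is left out on purpose so it can be entered in the operator's parameter panel.

    ```xml
    <!-- Sketch only: the operator from the question, switched from POP3 to IMAP. -->
    <operator activated="true" class="text:process_mail_documents" compatibility="5.3.000" expanded="true" name="Process Documents from Mail Store">
      <parameter key="host" value="imap.gmail.com"/>
      <!-- User value copied as it appears in the question; replace with your own address. -->
      <parameter key="user" value="[email protected]"/>
      <list key="connection_properties">
        <!-- Same IMAP properties as the "Read Documents (Mail)" operator in the question. -->
        <parameter key="mail.imap.port" value="993"/>
        <parameter key="mail.imap.ssl.enable" value="true"/>
        <parameter key="mail.imap.timeout" value="50000"/>
        <parameter key="mail.imap.connectiontimeout" value="50000"/>
      </list>
      <parameter key="protocol" value="imap"/>
      <parameter key="mark_seen" value="false"/>
      <!-- Assumption: a "folder" key is available here as it is on "Read Documents (Mail)". -->
      <parameter key="folder" value="INBOX"/>
      <process expanded="true">
        <operator activated="true" class="text:tokenize" compatibility="5.3.000" expanded="true" name="Tokenize"/>
        <connect from_port="document" to_op="Tokenize" to_port="document"/>
        <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
      </process>
    </operator>
    ```

    With IMAP enabled in the Gmail settings, folders other than the inbox should in principle be addressable as well, which may avoid the "Folder is not INBOX" error seen over POP.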