get pages - error: java.nio.CharBuffer

currant Member Posts: 14 Contributor II
edited November 2018 in Help
Hi All,

I am using the Get Pages operator and get the following "process failed" message:
java.nio.CharBuffer.subSequence(II)Ljava/nio/CharBuffer;

It happens in several different processes.
I am using RapidMiner version 5.3.000.

Has anyone seen this problem?

Thanks in advance!
Currant

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Currant,

    can you please post your process setup?

    Best regards,
    Marius
  • currant Member Posts: 14 Contributor II
    Dear Marius,

    The process is working again. I use this process often and it usually works, but last Friday and over the weekend I got the error message. On Monday the process worked again, although I had not changed it...

    Here is the process:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="read_excel" compatibility="5.3.005" expanded="true" height="60" name="Read Excel" width="90" x="45" y="120">
            <parameter key="excel_file" value="C:\temp\Links.xls"/>
            <parameter key="imported_cell_range" value="A1:A518"/>
            <parameter key="first_row_as_names" value="false"/>
            <list key="annotations">
              <parameter key="0" value="Name"/>
            </list>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="Link.true.file_path.attribute"/>
            </list>
          </operator>
          <operator activated="true" class="web:retrieve_webpages" compatibility="5.3.000" expanded="true" height="60" name="Get Pages" width="90" x="179" y="30">
            <parameter key="link_attribute" value="Link"/>
          </operator>
          <operator activated="true" class="text:data_to_documents" compatibility="5.3.000" expanded="true" height="60" name="Data to Documents" width="90" x="315" y="30">
            <list key="specify_weights"/>
          </operator>
          <operator activated="true" class="text:process_documents" compatibility="5.3.000" expanded="true" height="94" name="Process Documents" width="90" x="450" y="30">
            <parameter key="prune_below_rank" value="5.0"/>
            <parameter key="prune_above_rank" value="5.0"/>
            <process expanded="true">
              <operator activated="true" class="web:extract_html_text_content" compatibility="5.3.000" expanded="true" height="60" name="Extract Content" width="90" x="179" y="75">
                <parameter key="extract_content" value="false"/>
              </operator>
              <operator activated="true" class="text:extract_information" compatibility="5.3.000" expanded="true" height="60" name="Extract Information" width="90" x="447" y="30">
                <parameter key="query_type" value="XPath"/>
                <list key="string_machting_queries"/>
                <list key="regular_expression_queries"/>
                <list key="regular_region_queries"/>
                <list key="xpath_queries">
                  <parameter key="ResidueDefinition" value="//h:h1[@style='text-align:left']/text()"/>
                  <parameter key="Footnote1" value="//h:div[@class='col60']/h:ul/h:li[1]/text()"/>
                  <parameter key="Footnote2" value="//h:div[@class='col60']/h:ul/h:li[2]/text()"/>
                  <parameter key="LegisLink" value="//h:div[@class='col40']/h:p/h:a[1]/@href"/>
                  <parameter key="LegisText" value="//h:div[@class='col40']/h:p/h:a[1]/text()"/>
                  <parameter key="Footnote3" value="//h:div[@class='col60']/h:ul/h:li[3]/text()"/>
                  <parameter key="Footnote4" value="//h:div[@class='col60']/h:ul/h:li[4]/text()"/>
                  <parameter key="Footnote5" value="//h:div[@class='col60']/h:ul/h:li[5]/text()"/>
                  <parameter key="Footnote6" value="//h:div[@class='col60']/h:ul/h:li[6]/text()"/>
                  <parameter key="Footnote7" value="//h:div[@class='col60']/h:ul/h:li[7]/text()"/>
                  <parameter key="Footnote8" value="//h:div[@class='col60']/h:ul/h:li[8]/text()"/>
                  <parameter key="Footnote9" value="//h:div[@class='col60']/h:ul/h:li[9]/text()"/>
                  <parameter key="Footnote10" value="//h:div[@class='col60']/h:ul/h:li[10]/text()"/>
                  <parameter key="Footnote11" value="//h:div[@class='col60']/h:ul/h:li[11]/text()"/>
                  <parameter key="Footnote12" value="//h:div[@class='col60']/h:ul/h:li[12]/text()"/>
                </list>
                <list key="namespaces"/>
                <list key="index_queries"/>
              </operator>
              <connect from_port="document" to_op="Extract Content" to_port="document"/>
              <connect from_op="Extract Content" from_port="document" to_op="Extract Information" to_port="document"/>
              <connect from_op="Extract Information" from_port="document" to_port="document 1"/>
              <portSpacing port="source_document" spacing="0"/>
              <portSpacing port="sink_document 1" spacing="0"/>
              <portSpacing port="sink_document 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="5.3.005" expanded="true" height="76" name="Select Attributes" width="90" x="581" y="75">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="|LegisLink|LegisText|Footnote2|Footnote1|Footnote3|Footnote4|Footnote5|Footnote6|Footnote7|Footnote8|Footnote9|Footnote10|Footnote11|Footnote12|ResidueDefinition"/>
          </operator>
          <operator activated="true" class="write_excel" compatibility="5.3.005" expanded="true" height="76" name="Write Excel" width="90" x="581" y="210">
            <parameter key="excel_file" value="C:\Temp\RdEU.xlsx"/>
            <parameter key="file_format" value="xlsx"/>
          </operator>
          <connect from_op="Read Excel" from_port="output" to_op="Get Pages" to_port="Example Set"/>
          <connect from_op="Get Pages" from_port="Example Set" to_op="Data to Documents" to_port="example set"/>
          <connect from_op="Data to Documents" from_port="documents" to_op="Process Documents" to_port="documents 1"/>
          <connect from_op="Process Documents" from_port="example set" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Write Excel" to_port="input"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
    All links are in an Excel file, but you could also test it with a single link ( http://ec.europa.eu/sanco_pesticides/public/index.cfm?event=substance.info&id=38 ).

    While searching for a solution to this error, I found this page: http://mojo.10943.n7.nabble.com/Animal-sniffer-won-t-run-on-jdk-6-td38697.html

    Best wishes
    Currant
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Looking at your process, it seems that you have upgraded RapidMiner to 5.3.5 in the meantime. Maybe that is what made your process run again. Please let us know if the error returns.

    Best regards,
    Marius
  • Jester87 Member Posts: 10 Contributor II
    I received this error when I tried to use Get Page.

    Here is the Bug Info:
    RapidMiner: 5.3.005
    Parallel Processing: 5.3.000
    Text Processing: 5.3.000
    Weka: 5.3.001
    Web Mining: 5.3.000
    Series: 5.3.000
    Stack trace:
    ------------

    Exception: java.lang.NoSuchMethodError
    Message: java.nio.CharBuffer.subSequence(II)Ljava/nio/CharBuffer;
    Stack trace:
      com.rapidminer.operator.io.web.GetWebpageOperator.read(GetWebpageOperator.java:207)
      com.rapidminer.operator.io.web.GetWebpageOperator.read(GetWebpageOperator.java:61)
      com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:126)
      com.rapidminer.operator.Operator.execute(Operator.java:855)
      com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
      com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
      com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
      com.rapidminer.operator.Operator.execute(Operator.java:855)
      com.rapidminer.Process.run(Process.java:949)
      com.rapidminer.Process.run(Process.java:873)
      com.rapidminer.Process.run(Process.java:832)
      com.rapidminer.Process.run(Process.java:827)
      com.rapidminer.Process.run(Process.java:817)
      com.rapidminer.gui.ProcessThread.run(ProcessThread.java:63)
    Process:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <parameter key="parallelize_main_process" value="false"/>
        <process expanded="true">
          <operator activated="true" class="web:get_webpage" compatibility="5.3.000" expanded="true" height="60" name="Get Page" width="90" x="179" y="75">
            <parameter key="url" value="http://www.google.com"/>
            <parameter key="random_user_agent" value="true"/>
            <parameter key="connection_timeout" value="10000"/>
            <parameter key="read_timeout" value="10000"/>
            <parameter key="follow_redirects" value="true"/>
            <parameter key="accept_cookies" value="none"/>
            <parameter key="cookie_scope" value="global"/>
            <parameter key="request_method" value="GET"/>
            <list key="query_parameters"/>
            <list key="request_properties"/>
            <parameter key="override_encoding" value="false"/>
            <parameter key="encoding" value="SYSTEM"/>
          </operator>
          <connect from_op="Get Page" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
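
For readers decoding the error message in the stack trace above: the text after the method name, `(II)Ljava/nio/CharBuffer;`, is a JVM method descriptor. It says the caller expected a `subSequence` method taking two `int` arguments and returning `java.nio.CharBuffer`. The following is a minimal sketch that parses the descriptor with the standard `java.lang.invoke.MethodType` class (available from Java 7 on); the class name `DescriptorDemo` and its helper methods are hypothetical names chosen for illustration.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // Hypothetical helper: fully qualified name of the descriptor's return type.
    public static String returnTypeOf(String descriptor) {
        // A null class loader means "use the system class loader".
        MethodType type = MethodType.fromMethodDescriptorString(descriptor, null);
        return type.returnType().getName();
    }

    // Hypothetical helper: number of parameters the descriptor declares.
    public static int parameterCountOf(String descriptor) {
        return MethodType.fromMethodDescriptorString(descriptor, null).parameterCount();
    }

    public static void main(String[] args) {
        // Descriptor copied from the NoSuchMethodError message in this thread.
        String descriptor = "(II)Ljava/nio/CharBuffer;";
        System.out.println("parameters: " + parameterCountOf(descriptor));
        System.out.println("returns:    " + returnTypeOf(descriptor));
    }
}
```

A `NoSuchMethodError` carrying this descriptor means the running JVM has no `CharBuffer.subSequence(int, int)` with exactly that return type.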
  • Nils_Woehler Member Posts: 463 Maven
    Hi,

    This error occurs because you are running Java 6. Please install Java 7; afterwards everything should work again.

    Best,
    Nils
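
The Java 6 explanation above can be checked on a given machine. In Java 6, `CharBuffer.subSequence(int, int)` is declared to return `CharSequence`; from Java 7 on it returns `CharBuffer`, so code compiled against the Java 7 class library (presumably the case for Web Mining 5.3.000, though the thread does not confirm it) fails to link on Java 6. A minimal sketch, written in Java 5-compatible syntax so it also compiles on an old JDK; `SubSequenceCheck` is a hypothetical class name:

```java
import java.lang.reflect.Method;
import java.nio.CharBuffer;

public class SubSequenceCheck {
    // True if this JVM exposes CharBuffer.subSequence(int, int) returning CharBuffer,
    // i.e. the exact signature named in the NoSuchMethodError message.
    public static boolean hasJava7Signature() {
        Method[] methods = CharBuffer.class.getMethods();
        for (int i = 0; i < methods.length; i++) {
            Method m = methods[i];
            Class<?>[] params = m.getParameterTypes();
            if (m.getName().equals("subSequence")
                    && params.length == 2
                    && params[0] == int.class
                    && params[1] == int.class
                    && m.getReturnType() == CharBuffer.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Show which JVM is actually running, then whether it has the method.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("CharBuffer.subSequence returns CharBuffer: " + hasJava7Signature());
    }
}
```

Run it with the same `java` executable that launches RapidMiner; if it reports `false`, that JVM is older than Java 7, whatever else is installed on the machine.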
  • Jester87 Member Posts: 10 Contributor II
    Yeah I just figured this out. I think Apple has stopped supporting Java for Snow Leopard, so I guess this means I have to find a workaround.  >:(
  • Nils_Woehler Member Posts: 463 Maven
    Yes, Apple no longer provides new Java versions. But you can now use Oracle's Java 7: http://java.com/en/download/manual.jsp

    Best,
    Nils
  • Jester87 Member Posts: 10 Contributor II
    Java 7 requires OS X 10.7.3 or later.

    This means Snow Leopard users are out of luck.
  • Nils_Woehler Member Posts: 463 Maven
    Oh, I didn't know that. I thought Java 7 would be available for all recent OS X versions.
    In that case, only OS X users on version 10.7.3 or later can use the Web Mining extension...

    Best,
    Nils
  • punitha_c87 Member Posts: 10 Contributor II
    Hi All,

    I am getting the same error with Get Page and Get Pages on Windows 7, even after installing Java 7.
    Does anyone know the solution?
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    it is a bit strange you're getting this error with Windows 7, because we supply a Java 7 JRE with RapidMiner. Are you using RapidMiner 5.3? Or are you using 5.2.8 but updated your Web Mining Extension to 5.3?

    Regards,
    Marco