RapidMiner

RapidAnalytics - CookieHandler.getDefault() returns null

Regular Contributor

We recently upgraded the JRE on our RapidAnalytics servers from Java SE 1.6 to 1.7.0_45, and since then we have been having a very odd problem with the Get Page and Get Pages operators. RapidAnalytics version 1.3.013 is running on Windows Server 2008 R2. This time we installed the JRE from the "Server JRE" tar file instead of using the installation program, but I am not sure whether that has anything to do with the problem.

On line 89 of the HttpURLConnectionProvider class, the call to CookieHandler.getDefault() returns null, which makes the operator fail a couple of lines later. Interestingly, if we restart the service everything works fine for a while (the call returns a cookie manager), but a couple of executions later the problem happens again.

Has anybody experienced a similar problem? Any idea of what else we can look into?

This is an example process with which I am able to reproduce the problem:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.013">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.3.013" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="web:get_webpage" compatibility="5.3.001" expanded="true" height="60" name="Get Page" width="90" x="246" y="75">
        <parameter key="url" value="http://elmundo.es"/>
        <list key="query_parameters"/>
        <list key="request_properties"/>
      </operator>
      <connect from_op="Get Page" from_port="output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>


These are the top lines of the stack trace:


SEVERE: Process failed: java.lang.NullPointerException
java.lang.NullPointerException
at com.rapidminer.operator.io.web.HttpURLConnectionProvider.getConnectionInstance(HttpURLConnectionProvider.java:106)
at com.rapidminer.operator.io.web.GetWebpageOperator.read(GetWebpageOperator.java:136)
at com.rapidminer.operator.io.web.GetWebpageOperator.read(GetWebpageOperator.java:61)
at com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:126)
at com.rapidminer.operator.Operator.execute(Operator.java:867)
at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:375)
at com.rapidminer.operator.Operator.execute(Operator.java:867)


Thanks!
5 REPLIES
Regular Contributor

Re: RapidAnalytics - CookieHandler.getDefault() returns null

Hi again,

In case it helps anyone, it looks like we may have solved the problem by adding the following lines to the HttpURLConnectionProvider class.

Thanks,
Miguel


...
public static HttpURLConnection getConnectionInstance(ParameterHandler handler, URL url) throws UndefinedParameterError, IOException {
    MultiThreadedCookieManager cookieManager = (MultiThreadedCookieManager) CookieHandler.getDefault();
    if (cookieManager == null) {
        cookieManager = new MultiThreadedCookieManager();
        CookieHandler.setDefault(cookieManager);
    }

    if (handler.getParameterAsInt(PARAMETER_ACCEPT_COOKIES) != ACCEPT_NO_COOKIES) {
...
Regular Contributor

Re: RapidAnalytics - CookieHandler.getDefault() returns null

After applying this fix, we are now also getting this other error in some cases, though not always:


java.lang.ClassCastException: java.net.CookieManager cannot be cast to com.rapidminer.operator.io.web.MultiThreadedCookieManager
at com.rapidminer.operator.io.web.HttpURLConnectionProvider.getConnectionInstance(HttpURLConnectionProvider.java:89)
at com.rapidminer.operator.io.web.GetWebpageOperator.read(GetWebpageOperator.java:136)
at com.rapidminer.operator.features.construction.RetrievePagesOperator.doWork(RetrievePagesOperator.java:161)
at com.rapidminer.operator.Operator.execute(Operator.java:867)
at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)


My guess is that somewhere in the code the CookieHandler is being initialized with a plain CookieManager instead of a MultiThreadedCookieManager. Could that be?
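
For anyone hitting the same pair of errors, here is a minimal sketch of a guard that tolerates both cases (no default handler at all, or a default handler of a different type). StandInCookieManager is a hypothetical placeholder for RapidMiner's MultiThreadedCookieManager, and CookieHandlerGuard and ensureCookieManager are names made up for this example; only CookieHandler.getDefault() and CookieHandler.setDefault() are real JDK calls. It is not the actual RapidMiner code.

```java
import java.net.CookieHandler;
import java.net.CookieManager;

// Hypothetical stand-in for com.rapidminer.operator.io.web.MultiThreadedCookieManager.
class StandInCookieManager extends CookieManager {
}

class CookieHandlerGuard {

    // Returns the JVM-wide handler if it already has the expected type;
    // otherwise installs a fresh one. The instanceof test is false for null,
    // so it covers both the NullPointerException and the ClassCastException case.
    static synchronized StandInCookieManager ensureCookieManager() {
        CookieHandler current = CookieHandler.getDefault();
        if (current instanceof StandInCookieManager) {
            return (StandInCookieManager) current;
        }
        StandInCookieManager replacement = new StandInCookieManager();
        CookieHandler.setDefault(replacement);
        return replacement;
    }

    public static void main(String[] args) {
        // Case 1: nothing installed, as in the original NullPointerException.
        CookieHandler.setDefault(null);
        System.out.println(ensureCookieManager().getClass().getSimpleName());

        // Case 2: some other code installed a plain CookieManager.
        CookieHandler.setDefault(new CookieManager());
        System.out.println(ensureCookieManager().getClass().getSimpleName());
    }
}
```

The trade-off is that replacing a handler installed by other code discards whatever cookies that code had stored, so this may simply move the problem elsewhere if the server itself relies on the default handler.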
Regular Contributor

Re: RapidAnalytics - CookieHandler.getDefault() returns null

I think this problem only happens in the RapidAnalytics server because the operator relies on the system-wide cookie handler (CookieHandler.setDefault() installs one handler for the whole JVM), which is probably shared by the operators and the RapidAnalytics server code. The operator expects a MultiThreadedCookieManager, and I am guessing RapidAnalytics installs a plain CookieManager.

What would be the best way to solve this issue? Create a new MultiThreadedCookieManager every time in the Get Pages operator?
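
One way to picture the "own cookie manager" option is a sketch like the following, which keeps a private java.net.CookieManager and never touches the system-wide default. PrivateCookieJar, remember and headersFor are hypothetical names for this example, and the URL and cookie value are invented; CookieManager.put() and CookieManager.get() are the standard JDK methods. Whether this fits the way the operator builds its connections is an assumption.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a cookie store owned by one caller instead of the whole JVM.
class PrivateCookieJar {

    private final CookieManager cookies = new CookieManager();

    // Store any Set-Cookie headers found in a response for the given URI.
    void remember(URI uri, Map<String, List<String>> responseHeaders) throws IOException {
        cookies.put(uri, responseHeaders);
    }

    // Build the cookie request headers to send with the next request to the URI.
    Map<String, List<String>> headersFor(URI uri) throws IOException {
        return cookies.get(uri, new HashMap<String, List<String>>());
    }

    public static void main(String[] args) throws IOException {
        PrivateCookieJar jar = new PrivateCookieJar();
        URI uri = URI.create("http://example.com/");

        // Simulated response headers; a real caller would pass the headers of its connection.
        Map<String, List<String>> response = Collections.singletonMap(
                "Set-Cookie", Collections.singletonList("session=abc123; Path=/"));
        jar.remember(uri, response);

        System.out.println(jar.headersFor(uri));
    }
}
```

With this approach the caller would have to copy the returned headers onto each request and feed each response's headers back in by hand. Note that HttpURLConnection may still consult a default handler if some other code has installed one.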

Super Contributor

Re: RapidAnalytics - CookieHandler.getDefault() returns null

Hi Miguel,

RapidAnalytics does not officially support Java 1.7. It will be supported with the next release. I am moving this topic to the Development forum; I think that's a better place for it.

Best regards,
Marius
Regular Contributor

Re: RapidAnalytics - CookieHandler.getDefault() returns null

Hi Marius,

I see, I didn't know that Java 1.7 wasn't supported. The only reason we upgraded the server recently is that we started experiencing this issue using RapidAnalytics (http://rapid-i.com/rapidforum/index.php/topic=6254.0). I have to say that upgrading Java from 1.6 to 1.7 actually got rid of that problem, but now we have a different one :)

Thanks for relocating the post.

Cheers,
Miguel