Recently, I upgraded from RapidMiner v7.2 to v7.3. After the upgrade, Radoop throws java.util.concurrent.TimeoutException while connecting to HiveServer2. In another RapidMiner installation (v7.2), the same configuration works fine.
Current config details:
Hadoop version: Apache Hadoop 2.2+
Hadoop user name: hadoop
Hive Server2 (Hive 0.13 or newer)
Are there any configuration changes to be made in Radoop for v7.3? I have tried RapidMiner v7.3 + Radoop 7.2 as well as RapidMiner v7.3 + Radoop 7.3. Neither works. Please help.
It would be a bit surprising if Studio 7.2 and 7.3 behaved differently with the same Radoop version. (So it is valuable if we find such a case.) Can you reproduce this behaviour consistently?
I'll copy my answer on how to move on with the problem from another topic.
The error states that there was no response from the HiveServer2 instance (specified by either the Master Address or the Hive Server Address field, and the Hive Port) within a given time.
I would try the following:
Thanks, Peter, for the response.
Yes, the behaviour is consistently reproducible. Yesterday, I created an Amazon EMR cluster and tried connecting through Radoop. The same issue persists even if I open all inbound ports on the EMR master instance.
All URLs (NameNode, history server, Spark, etc.) are accessible remotely. Only the HiveServer2 connection fails. I tried increasing the timeouts earlier, up to 4 minutes, but no luck. Hive works through Beeline (tested this locally on the cluster).
Let me know if there are any other tests I can try out.
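One more test that fits here is checking, from the client machine rather than from the cluster, whether the HiveServer2 port accepts a plain TCP connection at all. The sketch below is only an illustration: the function name can_reach and the host name in the usage comment are made up, and 10000 is the HiveServer2 default port, which may differ from the Hive Port configured in Radoop.

```python
import socket


def can_reach(host, port, timeout_s=5.0):
    """Return True if a direct TCP connection to host:port opens within timeout_s.

    This bypasses any proxy, so it only shows whether the port is reachable
    on the network path from this machine.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Hypothetical usage (replace with your own master address and Hive Port):
# can_reach("emr-master.example.com", 10000)
```

If this returns True while Radoop still times out, the port itself is reachable and the problem is more likely in how the client routes the connection (for example a proxy setting) than in the firewall or security group.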
Figured out the issue and resolved it.
The problem is a change made in RapidMiner v7.3 under System -> Preferences. Earlier, under System, one had to explicitly specify the HTTP proxy, and the default was no proxy. In the new version, the proxy is a separate option (under System -> Preferences) and by default it is set to 'System proxy'. Once I changed it to Direct (no proxy), it worked fine. I think the default option should be no proxy.
Sharing this as it might help others who face similar issues after the upgrade.
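For anyone who wants to check whether their machine has system-level proxy settings that a 'System proxy' default could pick up, here is a rough sketch. It reports what Python's standard library sees (environment variables, plus OS settings on Windows and macOS), which is only an approximation of what RapidMiner's JVM resolves. The function name proxy_report and the host name in the usage comment are made up for illustration.

```python
import urllib.request


def proxy_report(host):
    """Summarise the system proxy settings visible to Python for a given host."""
    # getproxies() maps scheme -> proxy URL; a "no" entry holds the
    # no_proxy exclusion list and is not a proxy itself, so drop it.
    proxies = {k: v for k, v in urllib.request.getproxies().items() if k != "no"}
    # proxy_bypass() is truthy when the host is excluded from proxying.
    bypassed = bool(urllib.request.proxy_bypass(host))
    return {
        "proxies": proxies,
        "bypassed": bypassed,
        "proxy_in_play": bool(proxies) and not bypassed,
    }


# Hypothetical usage (replace with your own Hive Server Address):
# print(proxy_report("emr-master.example.com"))
```

If proxy_in_play is True, a system proxy is configured for that host, and switching the RapidMiner proxy setting to Direct (no proxy) as described above is worth trying.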