RapidMiner

Radoop Connection: Full Test Error

Hi,

I have a 4-node CDH5 cluster that I want to connect to using Radoop and RapidMiner. The full test fails with the error below during the distributed file system upload:
[May 19, 2015 5:33:09 PM]: Distributed file system upload started.
[May 19, 2015 5:37:10 PM] SEVERE: Test data upload to the distributed file system timed out. Please check the NameNode and DataNode services and their logs for traces of errors.

I have opened all necessary ports, for your information.

Thanks in advance for your help.


Below please find the full test log:
[May 19, 2015 5:32:58 PM]: Connection test for 'master.example.com' started.
[May 19, 2015 5:32:58 PM]: Hive server 2 connection (???.???.???.??:10000) test started.
[May 19, 2015 5:32:59 PM]: Hive server 2 connection test succeeded.
[May 19, 2015 5:32:59 PM]: Retrieving required configuration properties...
[May 19, 2015 5:33:00 PM]: Successfully fetched property: yarn.resourcemanager.scheduler.address
[May 19, 2015 5:33:00 PM]: Successfully fetched property: yarn.resourcemanager.resource-tracker.address
[May 19, 2015 5:33:00 PM]: Successfully fetched property: yarn.resourcemanager.admin.address
[May 19, 2015 5:33:00 PM]: Successfully fetched property: yarn.application.classpath
[May 19, 2015 5:33:00 PM]: MapReduce Home added to yarn.application.classpath ($HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*)
[May 19, 2015 5:33:00 PM]: Successfully fetched property: mapreduce.jobhistory.address
[May 19, 2015 5:33:01 PM]: Distributed file system test started.
[May 19, 2015 5:33:01 PM]: Distributed file system test succeeded.
[May 19, 2015 5:33:01 PM]: MapReduce test started.
[May 19, 2015 5:33:02 PM]: MapReduce test succeeded.
[May 19, 2015 5:33:02 PM]: Radoop temporary directory test started.
[May 19, 2015 5:33:02 PM]: Radoop temporary directory test succeeded.
[May 19, 2015 5:33:02 PM]: MapReduce staging directory test started.
[May 19, 2015 5:33:02 PM]: MapReduce staging directory test succeeded.
[May 19, 2015 5:33:02 PM]: Connection test for 'master.example.com' completed successfully.
[May 19, 2015 5:33:07 PM]: --------------------------------------------------
[May 19, 2015 5:33:07 PM]: Connection test for 'master.example.com' started.
[May 19, 2015 5:33:07 PM]: Hive server 2 connection (???.???.???.??:10000) test started.
[May 19, 2015 5:33:07 PM]: Hive server 2 connection test succeeded.
[May 19, 2015 5:33:07 PM]: Retrieving required configuration properties...
[May 19, 2015 5:33:09 PM]: Successfully fetched property: yarn.resourcemanager.scheduler.address
[May 19, 2015 5:33:09 PM]: Successfully fetched property: yarn.resourcemanager.resource-tracker.address
[May 19, 2015 5:33:09 PM]: Successfully fetched property: yarn.resourcemanager.admin.address
[May 19, 2015 5:33:09 PM]: Successfully fetched property: yarn.application.classpath
[May 19, 2015 5:33:09 PM]: MapReduce Home added to yarn.application.classpath ($HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*)
[May 19, 2015 5:33:09 PM]: Successfully fetched property: mapreduce.jobhistory.address
[May 19, 2015 5:33:09 PM]: Distributed file system test started.
[May 19, 2015 5:33:09 PM]: Distributed file system test succeeded.
[May 19, 2015 5:33:09 PM]: MapReduce test started.
[May 19, 2015 5:33:09 PM]: MapReduce test succeeded.
[May 19, 2015 5:33:09 PM]: Radoop temporary directory test started.
[May 19, 2015 5:33:09 PM]: Radoop temporary directory test succeeded.
[May 19, 2015 5:33:09 PM]: MapReduce staging directory test started.
[May 19, 2015 5:33:09 PM]: MapReduce staging directory test succeeded.
[May 19, 2015 5:33:09 PM]: Connection test for 'master.example.com' completed successfully.
[May 19, 2015 5:33:09 PM]: --------------------------------------------------
[May 19, 2015 5:33:09 PM]: Integration test for 'master.example.com' started.
[May 19, 2015 5:33:09 PM]: The test may require several minutes to complete.
[May 19, 2015 5:33:09 PM]: Distributed file system upload started.
[May 19, 2015 5:37:10 PM] SEVERE: Test data upload to the distributed file system timed out. Please check the NameNode and DataNode services and their logs for traces of errors.
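
Since the error message points at the NameNode and DataNode services, one quick way to double-check the "all necessary ports are open" claim from the machine running RapidMiner is a plain TCP reachability probe. The sketch below is only an illustration and not part of Radoop. It assumes the stock CDH5 port numbers (8020 for the NameNode, 50010 for DataNode data transfer), which may differ on your cluster, and the DataNode host name in the comment is hypothetical. An HDFS client writes file blocks directly to the DataNodes, so every DataNode address has to be reachable from the client, not just the master.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name resolution.
        return False


def check_endpoints(endpoints, timeout=5.0):
    """Probe every (host, port) pair and map it to True/False."""
    return {(host, port): can_connect(host, port, timeout)
            for host, port in endpoints}


# Example call (ports are assumed defaults, the DataNode name is made up):
# check_endpoints([("master.example.com", 8020),
#                  ("datanode1.example.com", 50010)])
```

If the NameNode port answers but a DataNode port does not, that would match the pattern in the log above, where the metadata-only tests pass and the actual upload times out. A successful probe does not rule out other causes such as directory permissions.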
5 REPLIES

Re: Radoop Connection: Full Test Error

Hi,
I am an absolute rookie in this topic, but does changing the timeout and other settings in your RapidMiner preferences change anything? That worked for me.
Cheers
Sven

Re: Radoop Connection: Full Test Error

Thanks Sven,

But as you can see in my post, the time difference is about 4 minutes, and even after I set a very large timeout (in seconds) it still doesn't work. I think this is a file/directory permission issue. I will check that and then post here again.
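
The gap between the "upload started" line and the SEVERE line in the log from the first post can be computed directly from their timestamps. This is a minimal sketch, assuming the bracketed timestamp format shown in that log and an English locale for month names. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches the leading "[May 19, 2015 5:33:09 PM]" part of a log line.
STAMP = re.compile(r"^\[(.+? [AP]M)\]")


def parse_stamp(line):
    """Return the timestamp at the start of a log line, or None if absent."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")


def elapsed_seconds(start_line, end_line):
    """Seconds between the timestamps of two log lines."""
    return (parse_stamp(end_line) - parse_stamp(start_line)).total_seconds()


start = "[May 19, 2015 5:33:09 PM]: Distributed file system upload started."
end = "[May 19, 2015 5:37:10 PM] SEVERE: Test data upload to the distributed file system timed out."
print(elapsed_seconds(start, end))  # 241.0
```

Comparing this number with the timeout configured in the preferences shows whether the test really waited for the configured period or gave up earlier.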
RM Staff

Re: Radoop Connection: Full Test Error

Hi,

Did you have a look at the Cloudera Manager logs?
Also, I would again recommend asking our support at support.rapidminer.com, since I think Sven and I are the only people here using Radoop.

Cheers,
Martin
--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner

Re: Radoop Connection: Full Test Error

Hi,
Overall, I really like the Radoop extension. The main reason is that it lets me run processes without straining my own computer. Martin will agree with me that the extra power means you no longer need to pay the same attention to preprocessing your data. I hope more operators become available within Radoop. What are the plans for that? Downtime is minimal in my setup, and subjectively things run about 10 times faster.
Cheers
Sven
RM Staff

Re: Radoop Connection: Full Test Error

If you look at the last two releases, you can see that we added more predictive power with Linear Regression, Logistic Regression and Decision Tree.
So I think this shows that we are actively putting more algorithmic power into Radoop.

We had our Radoop team in Dortmund last week. I got some time to talk with them - trust me, there is even more to come :)
--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner