RapidMiner

Contributor I kevin_m

Spark Job Failed - Connection from Cloudera to Radoop

Hello, can someone help me? I get the following error message when connecting Cloudera to RapidMiner Radoop. All services except Spark work and can be connected to. What do I have to do so that the Spark job runs as well? Thanks in advance.

 

[Jul 6, 2017 8:31:03 PM]: Integration test for 'myCloudera' started.
[Jul 6, 2017 8:31:03 PM]: Using Radoop version 7.5.0.
[Jul 6, 2017 8:31:03 PM]: Running tests: [Hive connection, Fetch dynamic settings, Java version, HDFS, MapReduce, Radoop temporary directory, MapReduce staging directory, Spark staging directory, Spark assembly jar existence, UDF jar upload, Create permanent UDFs, HDFS upload, Spark job]
[Jul 6, 2017 8:31:03 PM]: Running test 1/13: Hive connection
[Jul 6, 2017 8:31:03 PM]: Hive server 2 connection (master.cluster.com:10000) test started.
[Jul 6, 2017 8:31:03 PM]: Test succeeded: Hive connection (0.177s)
[Jul 6, 2017 8:31:03 PM]: Running test 2/13: Fetch dynamic settings
[Jul 6, 2017 8:31:03 PM]: Retrieving required configuration properties...
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: hive.execution.engine
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: mapreduce.jobhistory.done-dir
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: mapreduce.jobhistory.intermediate-done-dir
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: dfs.user.home.dir.prefix
[Jul 6, 2017 8:31:04 PM]: Could not fetch property dfs.encryption.key.provider.uri
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.executor.memory
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.executor.cores
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.driver.memory
[Jul 6, 2017 8:31:04 PM]: Could not fetch property spark.driver.cores
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.yarn.executor.memoryOverhead
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.yarn.driver.memoryOverhead
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.dynamicAllocation.enabled
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.dynamicAllocation.initialExecutors
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.dynamicAllocation.minExecutors
[Jul 6, 2017 8:31:04 PM]: Successfully fetched property: spark.dynamicAllocation.maxExecutors
[Jul 6, 2017 8:31:04 PM]: Could not fetch property spark.executor.instances
[Jul 6, 2017 8:31:04 PM]: The specified local value of mapreduce.job.reduces (2) differs from remote value (-1).
[Jul 6, 2017 8:31:04 PM]: The specified local value of mapreduce.reduce.speculative (false) differs from remote value (true).
[Jul 6, 2017 8:31:04 PM]: Test succeeded: Fetch dynamic settings (0.663s)
[Jul 6, 2017 8:31:04 PM]: Running test 3/13: Java version
[Jul 6, 2017 8:31:04 PM]: Cluster Java version: 1.8.0_131-b11
[Jul 6, 2017 8:31:04 PM]: Test succeeded: Java version (0.000s)
[Jul 6, 2017 8:31:04 PM]: Running test 4/13: HDFS
[Jul 6, 2017 8:31:04 PM]: Test succeeded: HDFS (0.291s)
[Jul 6, 2017 8:31:04 PM]: Running test 5/13: MapReduce
[Jul 6, 2017 8:31:04 PM]: Test succeeded: MapReduce (0.106s)
[Jul 6, 2017 8:31:04 PM]: Running test 6/13: Radoop temporary directory
[Jul 6, 2017 8:31:04 PM]: Test succeeded: Radoop temporary directory (0.306s)
[Jul 6, 2017 8:31:04 PM]: Running test 7/13: MapReduce staging directory
[Jul 6, 2017 8:31:05 PM]: Test succeeded: MapReduce staging directory (0.357s)
[Jul 6, 2017 8:31:05 PM]: Running test 8/13: Spark staging directory
[Jul 6, 2017 8:31:05 PM]: Test succeeded: Spark staging directory (0.316s)
[Jul 6, 2017 8:31:05 PM]: Running test 9/13: Spark assembly jar existence
[Jul 6, 2017 8:31:05 PM]: Spark assembly jar existence in the local:// file system cannot be checked. Test skipped.
[Jul 6, 2017 8:31:05 PM]: Test succeeded: Spark assembly jar existence (0.000s)
[Jul 6, 2017 8:31:05 PM]: Running test 10/13: UDF jar upload
[Jul 6, 2017 8:31:05 PM]: Remote radoop_hive-v4.jar is up to date.
[Jul 6, 2017 8:31:05 PM]: Test succeeded: UDF jar upload (0.300s)
[Jul 6, 2017 8:31:05 PM]: Running test 11/13: Create permanent UDFs
[Jul 6, 2017 8:31:06 PM]: Remote radoop_hive-v4.jar is up to date.
[Jul 6, 2017 8:31:06 PM]: Test succeeded: Create permanent UDFs (0.745s)
[Jul 6, 2017 8:31:06 PM]: Running test 12/13: HDFS upload
[Jul 6, 2017 8:31:07 PM]: Uploaded test data file size: 5642
[Jul 6, 2017 8:31:07 PM]: Test succeeded: HDFS upload (1.241s)
[Jul 6, 2017 8:31:07 PM]: Running test 13/13: Spark job
[Jul 6, 2017 8:31:47 PM] SEVERE: Test failed: Spark job
[Jul 6, 2017 8:31:47 PM]: Cleaning after test: Spark job
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: HDFS upload
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Create permanent UDFs
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: UDF jar upload
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Spark assembly jar existence
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Spark staging directory
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: MapReduce staging directory
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Radoop temporary directory
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: MapReduce
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: HDFS
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Java version
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Fetch dynamic settings
[Jul 6, 2017 8:31:48 PM]: Cleaning after test: Hive connection
[Jul 6, 2017 8:31:48 PM]: Total time: 45.007s
[Jul 6, 2017 8:31:48 PM] SEVERE: The Spark test failed. Please verify your Hadoop and Spark version and check if your assembly jar location is correct. If the job failed, check the logs on the ResourceManager web interface at http://master.cluster.com:8088.
[Jul 6, 2017 8:31:48 PM] SEVERE: Test failed: Spark job
[Jul 6, 2017 8:31:48 PM] SEVERE: Integration test for 'myCloudera' failed.

 


Unbenannt.PNG

 

 

4 REPLIES
RM Staff

Re: Spark Job Failed - Connection from Cloudera to Radoop

Hello,

 

can you please follow the first "logs" link on the second screen? It should show the stderr and stdout outputs. If they are too long, then there are links to the full content. The error message and/or stack trace from those should help more in figuring out the cause.

 

Thanks,

Peter

Contributor I kevin_m

Re: Spark Job Failed - Connection from Cloudera to Radoop

Log Type: stderr

Log Upload Time: Fri Jul 07 14:27:32 +0200 2017

Log Length: 213

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
 

Log Type: stdout

Log Upload Time: Fri Jul 07 14:27:32 +0200 2017

Log Length: 0

RM Staff
Solution

Re: Spark Job Failed - Connection from Cloudera to Radoop

The Spark assembly jar could not be found at the specified location. Since it is a local address, the file (or, for Spark 2.x, the directory) must exist on all nodes at the specified path. For example, the default Assembly Jar Location is "local:///opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar"; in that case, this path must exist on all nodes: /opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar
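
To check this on a node, something like the following Python sketch could be run there. It only strips the "local://" prefix from the address and tests whether the resulting path exists on the machine where the script runs. The function names are hypothetical helpers, not part of Radoop, and the address in the example is the default quoted above.

```python
import os
from urllib.parse import urlparse


def local_path_from_location(location):
    """Return the file system path behind a local:// address, or None for other schemes."""
    parsed = urlparse(location)
    if parsed.scheme != "local":
        return None
    return parsed.path


def exists_on_this_node(location):
    """True if the address is a local:// address and its path exists on this machine."""
    path = local_path_from_location(location)
    return path is not None and os.path.exists(path)


if __name__ == "__main__":
    # Default Assembly Jar Location from the post above; replace with your own setting.
    location = "local:///opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar"
    print("Path to check:", local_path_from_location(location))
    print("Exists on this node:", exists_on_this_node(location))
```

The check has to pass on every node of the cluster, not only on the one you happen to be logged in to.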

 

If it is somewhere else, the address must be modified. It is also possible to download an arbitrary Spark library from spark.apache.org, upload it to HDFS, specify an HDFS location (with the prefix "hdfs://"), and choose the proper Spark version.
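
As a rough sketch of the HDFS variant for a single jar file: the Python snippet below only builds the `hdfs dfs` commands for the upload and the matching "hdfs://" address to enter as the Assembly Jar Location. The local file /tmp/spark-assembly.jar and the HDFS directory /user/radoop/spark are made-up examples, and `hdfs_upload_plan` and `run_plan` are hypothetical helpers, not part of Radoop.

```python
import os
import posixpath
import subprocess


def hdfs_upload_plan(local_jar, hdfs_dir):
    """Return the hdfs CLI commands for the upload and the resulting hdfs:// address."""
    target = posixpath.join(hdfs_dir, os.path.basename(local_jar))
    commands = [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],   # create the target directory
        ["hdfs", "dfs", "-put", local_jar, target],  # copy the jar into HDFS
    ]
    # An absolute HDFS path after "hdfs://" refers to the cluster's default file system.
    return commands, "hdfs://" + target


def run_plan(commands):
    """Run the commands in order; needs the hdfs client on the PATH."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Both paths are assumptions for illustration only.
    commands, location = hdfs_upload_plan("/tmp/spark-assembly.jar", "/user/radoop/spark")
    for command in commands:
        print(" ".join(command))
    print("Assembly Jar Location:", location)
```

The script only prints the plan; call `run_plan(commands)` on a machine with a configured HDFS client to actually perform the upload.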

 

Peter

Contributor I kevin_m

Re: Spark Job Failed - Connection from Cloudera to Radoop

I found the right jar file after updating Spark to version 2.1. Thanks.