Spark job could not succeed for any supported Spark Version on Cloudera

Pavithra_Rao Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 123 RM Data Scientist
edited December 2018 in Knowledge Base

Symptoms

The following error message appears when running the Full Test to connect to a Cloudera cluster from the RapidMiner platform:

"The Spark job could not succeed for any supported Spark Version. It seems that the specified assembly jar or its location is incorrect: local:///opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar"

Diagnosis

  • Verified that the spark-assembly.jar is located on all the nodes.
  • Made sure there is no mismatch between the Spark version selected in the Configuration Properties of the Radoop Manage Connections window and the Spark version of the Hadoop cluster.
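
The first check above can be scripted roughly as follows. This is only a sketch: the helper names local_uri_to_path and check_jar_here are made up for illustration, and it assumes the script is run on each cluster node in turn (for example over ssh, with whatever host list applies to your cluster).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Radoop or Cloudera.

# Convert a Spark "local://" URI into the plain filesystem path it refers to.
local_uri_to_path() {
  printf '%s\n' "${1#local://}"
}

# Report whether the jar behind the URI exists as a regular file on this node.
check_jar_here() {
  jar_path=$(local_uri_to_path "$1")
  if [ -f "$jar_path" ]; then
    echo "OK: $jar_path"
  else
    echo "MISSING: $jar_path"
    return 1
  fi
}

# Example call, to be repeated on every node:
# check_jar_here "local:///opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar"
```

If the jar is reported as present on every node and the error still occurs, the location itself is probably not the cause, which is the situation described in this article.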

Solution

Cloudera's latest Spark builds (shipped with CDH 5.11 and 5.12) differ somewhat from the corresponding Apache Spark versions (they don't accept executor-cores and executor-memory options).

Using an Apache Spark release instead works fine. It can be installed on HDFS with the following or similar commands:

# do a kinit call, if Kerberos is used on the cluster
wget -O /tmp/spark-1.6.3-bin-hadoop2.6.tgz https://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz
tar xzvf /tmp/spark-1.6.3-bin-hadoop2.6.tgz -C /tmp/
hadoop fs -mkdir -p /tmp/spark
hadoop fs -put /tmp/spark-1.6.3-bin-hadoop2.6/lib/spark-assembly-1.6.3-hadoop2.6.0.jar /tmp/spark/

In this case, the specified assembly location in the Radoop connection should be:

"hdfs:///tmp/spark/spark-assembly-1.6.3-hadoop2.6.0.jar"
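
Before re-running the test, it may help to confirm that the uploaded jar is actually visible on HDFS. The snippet below is a minimal sketch using the standard hadoop fs -test -e check; the function name check_assembly_on_hdfs is hypothetical, and the path assumes the upload commands above were used unchanged.

```shell
#!/bin/sh
# Sketch only: check_assembly_on_hdfs is a hypothetical helper name.
# Do a kinit call first if Kerberos is used on the cluster.

check_assembly_on_hdfs() {
  # The hadoop CLI must be available, e.g. on a cluster gateway node.
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop CLI not found" >&2
    return 2
  fi
  # "hadoop fs -test -e" exits with 0 when the path exists.
  if hadoop fs -test -e "$1"; then
    echo "found: hdfs://$1"
  else
    echo "missing: hdfs://$1" >&2
    return 1
  fi
}

# Example call (path assumed from the upload commands above):
# check_assembly_on_hdfs /tmp/spark/spark-assembly-1.6.3-hadoop2.6.0.jar
```

When the file exists, the printed value is the same string that goes into the assembly location field of the Radoop connection.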
