
Radoop Full Test failing

behroz89 Member Posts: 1 Learner I
edited November 10 in Help

I am new to Radoop and I am trying to set up a development environment. My setup is:

- Virtual machine (Ubuntu) running in VirtualBox (I am not using the HDP image)

- 5 GB RAM assigned to the VM

- Spark 2.0.0

- Hadoop 2.8.5

- Hive 2.3.3

 

The quick tests all pass. When I run the full test, I get the following error:

[Nov 4, 2018 7:50:46 PM]: Running test 17/25: Hive load data
[Nov 4, 2018 7:50:52 PM]: Test succeeded: Hive load data (6.356s)
[Nov 4, 2018 7:50:52 PM]: Running test 18/25: Import job
[Nov 4, 2018 7:51:07 PM] SEVERE: Test failed: Import job
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Import job
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Hive load data
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Radoop jar upload
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: HDFS upload
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Create permanent UDFs
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: UDF jar upload
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Spark assembly jar existence
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Spark staging directory
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: MapReduce staging directory
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Radoop temporary directory
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: MapReduce
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: HDFS
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: YARN services networking
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: DataNode networking
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: NameNode networking
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Java version
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Fetch dynamic settings
[Nov 4, 2018 7:51:07 PM]: Cleaning after test: Hive connection
[Nov 4, 2018 7:51:07 PM]: Total time: 22.634s
[Nov 4, 2018 7:51:07 PM]: java.lang.Exception: Import job failed, see the job logs on the cluster for details.
at eu.radoop.connections.service.test.integration.TestHdfsImport.call(TestHdfsImport.java:95)
at eu.radoop.connections.service.test.integration.TestHdfsImport.call(TestHdfsImport.java:40)
at eu.radoop.connections.service.test.RadoopTestContext.lambda$runTest$1(RadoopTestContext.java:279)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

[Nov 4, 2018 7:51:07 PM] SEVERE: java.lang.Exception: Import job failed, see the job logs on the cluster for details.
[Nov 4, 2018 7:51:07 PM] SEVERE: Test data import from the distributed file system to Hive server 2 failed. Please check the logs of the MapReduce job on the ResourceManager web interface at http://${yarn.resourcemanager.hostname}:8088.
[Nov 4, 2018 7:51:07 PM] SEVERE: Test failed: Import job
[Nov 4, 2018 7:51:07 PM] SEVERE: Integration test for 'VirtualBoxVM' failed.

In the YARN container logs, I see the following error:

Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
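
From what I have read, this error usually means the MapReduce framework jars are not on the container classpath, typically because HADOOP_MAPRED_HOME is not resolvable inside the application master container. Is something like the following in mapred-site.xml the right direction? This is only a sketch; /usr/local/hadoop stands in for my actual Hadoop install directory.

<!-- mapred-site.xml sketch: /usr/local/hadoop is an assumed install path -->
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>

Or would it be better to set mapreduce.application.classpath to absolute paths instead?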

Furthermore, if I run just the Spark tests, I get the errors below.

My Radoop Spark settings:

- Spark 2.0

- Assembly path -> hdfs:///spark/jars/*

- Resource Allocation Policy -> Static, Default Configuration

 

Logs

[Nov 4, 2018 7:55:44 PM]: Running test 3/4: HDFS upload
[Nov 4, 2018 7:55:44 PM]: Uploaded test data file size: 5642
[Nov 4, 2018 7:55:44 PM]: Test succeeded: HDFS upload (0.075s)
[Nov 4, 2018 7:55:44 PM]: Running test 4/4: Spark job
[Nov 4, 2018 7:55:44 PM]: Assuming Spark version Spark 2.0.
[Nov 4, 2018 7:56:38 PM]: Assuming Spark version Spark 1.4 or below.
[Nov 4, 2018 7:56:38 PM] SEVERE: Test failed: Spark job
[Nov 4, 2018 7:56:38 PM]: Cleaning after test: Spark job
[Nov 4, 2018 7:56:38 PM]: Cleaning after test: HDFS upload
[Nov 4, 2018 7:56:38 PM]: Cleaning after test: Spark staging directory
[Nov 4, 2018 7:56:38 PM]: Cleaning after test: Fetch dynamic settings
[Nov 4, 2018 7:56:38 PM]: Total time: 53.783s
[Nov 4, 2018 7:56:38 PM] SEVERE: com.rapidminer.operator.UserError: The specified Spark assembly jar, archive or lib directory does not exist or cannot be read.
[Nov 4, 2018 7:56:38 PM] SEVERE: The Spark test failed. Please verify your Hadoop and Spark version and check if your assembly jar location is correct. If the job failed, check the logs on the ResourceManager web interface at http://${yarn.resourcemanager.hostname}:8088.
[Nov 4, 2018 7:56:38 PM] SEVERE: Test failed: Spark job
[Nov 4, 2018 7:56:38 PM] SEVERE: Integration test for 'VirtualBoxVM' failed.

ResourceManager logs (the full logs are attached to the post):

User class threw exception: org.apache.spark.SparkException: Spark test failed: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/tmp/radoop/training-vm/tmp_1541357744748_x0migqc
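
For the Spark assembly error, I assume Radoop expects the Spark 2.0.0 jars to already exist and be readable under hdfs:///spark/jars/. Is uploading them along these lines correct? This is only a sketch; SPARK_HOME is assumed to point at my Spark 2.0.0 installation.

# Upload the Spark 2.0.0 jars to HDFS and check that they are readable
hdfs dfs -mkdir -p /spark/jars
hdfs dfs -put $SPARK_HOME/jars/*.jar /spark/jars/
hdfs dfs -ls /spark/jars

I am also unsure whether the assembly location should keep the trailing /* or point at the directory itself.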

 

Apart from this, I have also attached my yarn-site.xml and mapred-site.xml.

 

Any help would be much appreciated.
