12-07-2016 08:13 AM
I have managed to connect Hive, Spark, and Hadoop and to set up the Radoop connection. I am now working with the Radoop Nest on the "Titanic" example data. I have loaded the Titanic data into Hive and want to run a Radoop validation process on it. The process fails with this error:
HiveQL problem Message: Error running query: java.lang.NoClassDefFoundError: scala/collection.Iterable
Where do you think my problem is?
12-07-2016 08:37 AM
The issue is probably related to the Hive classpath on your Hadoop cluster. Let me ask for a few details to make troubleshooting easier:
12-07-2016 10:30 AM
I changed "hive.execution.engine" to "mr", and I received a response from RapidMiner that "The capabilites are insufficient on the data".
In the Full Test on the Radoop connection, I received an error at test number 18, "Import job into Hive". I extracted the log file from the full test and attached it as a zip file. Is this alright as a log, or is there another step whose log I need to show?
12-08-2016 04:31 AM
You are absolutely right: I had a whitespace character after "localhost" in the JobHistory server setting.
I corrected that and reran the Full Test, but I still have the same problem at test 18 of the Radoop connection Full Test, at the "Import job" step.
Could you check the new log zip file? It is attached.
Just to let you know about my Hadoop, Hive, and YARN setup: I installed Hadoop and Hive myself by downloading the binaries from the Apache site and configured everything from scratch, so I am not using Cloudera. It seems that what I have configured is not enough, and some parameters are missing or misconfigured.
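A trailing space like the one after "localhost" is easy to miss by eye. Here is a minimal Python sketch of a check for such values. The dictionary keys are hypothetical names made up for illustration, not actual Radoop setting names, and the values would have to be copied in by hand from the connection dialog.

```python
def find_padded_values(params):
    """Return the keys whose values carry leading or trailing whitespace."""
    return [key for key, value in params.items() if value != value.strip()]


# Hypothetical connection settings; the key names are illustrative only.
connection = {
    "namenode_address": "localhost",
    "jobhistory_server_address": "localhost ",  # trailing space
}

for key in find_padded_values(connection):
    # repr() makes the otherwise invisible whitespace visible in the output.
    print(f"{key} has stray whitespace: {connection[key]!r}")
```

Printing the value with `repr()` is the useful part: the quotes around the string show exactly where it ends.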
12-08-2016 05:31 AM
It seems that you've set a few special settings in your connection as Advanced Hadoop Parameters. Radoop automatically sets the commonly required Hadoop properties, so there is no need to define e.g. fs.default.name as an advanced parameter.
Are you using KMS on your cluster? If you have not configured it, the related properties are most likely not needed, and you can safely turn off all KMS-related settings.
In general, I'd suggest disabling every Advanced Hadoop Parameter you have in your connection and re-running the Full Test.
(By the way, are you sure that your NameNode runs on port 54310? This is quite unusual.)
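To double-check the NameNode port, one option is to read it from the cluster's own core-site.xml rather than from memory. Below is a minimal Python sketch of that idea. It assumes the standard Hadoop configuration layout of `<property>` elements with `<name>` and `<value>` children, and it looks for either fs.defaultFS or the older fs.default.name. The embedded sample XML is hypothetical and only mirrors the host and port mentioned in this thread; in practice you would read the real file from your Hadoop configuration directory.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical sample content; replace with the text of your own core-site.xml.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>"""


def namenode_endpoint(core_site_xml):
    """Return (host, port) from fs.defaultFS or the older fs.default.name.

    Returns None if neither property is present.
    """
    root = ET.fromstring(core_site_xml)
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        if name in ("fs.defaultFS", "fs.default.name"):
            # strip() also guards against stray whitespace around the URI.
            uri = urlparse(prop.findtext("value", default="").strip())
            return uri.hostname, uri.port
    return None


print(namenode_endpoint(SAMPLE_CORE_SITE))
```

The host and port reported here should match the NameNode address and port entered in the Radoop connection.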