RapidMiner

Custom storage handlers on Hadoop when using Radoop "Store in Hive"

by RMStaff ‎08-04-2016 02:33 PM - edited ‎08-09-2016 08:22 AM

When using the RapidMiner Radoop "Store in Hive" operator, you may need to use a custom storage handler.

Storage handlers make it possible for Hive to access data stored and managed by other systems.

RapidMiner’s “Store in Hive” operator provides a lot of flexibility when saving data to Hive tables or to external tables in HDFS or Amazon S3.

Additionally, custom storage handlers may allow you to use Hypertable, Cassandra, JDBC, MongoDB, or Google Spreadsheets, as documented here

 

To enable custom storage, make sure the advanced parameters are visible, as shown below.

Then click the “Custom Storage” checkbox to explore the options for using custom storage handlers.

 

store in hive Radoop .png

 

Once you click the "Custom Storage" option, additional options become available, as shown below.

When providing the custom storage handler, ensure that its class is available on the CLASSPATH of the Hive server.
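
As a rough illustration of what a storage handler setting corresponds to on the Hive side, the Python sketch below builds a Hive CREATE TABLE statement with a STORED BY clause. This is not the statement Radoop itself generates; the table name, columns, and the handler class com.example.hive.ExampleStorageHandler are hypothetical placeholders. Hive resolves the class named in STORED BY at execution time, which is why the handler must be on the Hive server's CLASSPATH.

```python
def create_table_with_handler(table, columns, handler_class):
    """Build a Hive CREATE TABLE statement that delegates storage to a handler.

    columns is a list of (column_name, hive_type) pairs.
    handler_class is the fully qualified Java class name of the handler.
    """
    column_list = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE `{table}` ({column_list}) STORED BY '{handler_class}'"


# Hypothetical table, columns, and handler class, for illustration only.
statement = create_table_with_handler(
    "customers",
    [("id", "INT"), ("name", "STRING")],
    "com.example.hive.ExampleStorageHandler",
)
print(statement)
```

If the class is missing from the CLASSPATH, a statement of this shape would be expected to fail on the Hive side rather than in RapidMiner Studio.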


2016-08-04 19_26_08-Cortana.png

 

The user-defined SerDe properties can then be added by clicking the “Edit List” button.

Please note that the SerDe properties are case sensitive.
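
Because property names are case sensitive, two keys that differ only in letter case are treated as different properties. The Python sketch below shows one possible way to render a WITH SERDEPROPERTIES clause and to flag keys that collide when case is ignored, which usually points to a typo. The property names and values are hypothetical, and the helper functions are illustrative rather than part of Radoop or Hive.

```python
def serde_properties_clause(props):
    """Render a Hive WITH SERDEPROPERTIES clause, keeping keys exactly as given."""
    for text in list(props.keys()) + list(props.values()):
        # Keep the sketch simple: refuse values that would need escaping.
        if "'" in text:
            raise ValueError(f"single quotes are not supported in this sketch: {text!r}")
    pairs = ", ".join(f"'{key}' = '{value}'" for key, value in props.items())
    return f"WITH SERDEPROPERTIES ({pairs})"


def find_case_collisions(props):
    """Return pairs of keys that differ only in letter case."""
    seen = {}
    collisions = []
    for key in props:
        lowered = key.lower()
        if lowered in seen:
            collisions.append((seen[lowered], key))
        else:
            seen[lowered] = key
    return collisions


# Hypothetical property names, for illustration only.
clause = serde_properties_clause({"example.mapping": "id,name"})
collisions = find_case_collisions({"example.mapping": "a", "Example.Mapping": "b"})
print(clause)
print(collisions)
```

Running a check like this before filling in the “Edit List” dialog can catch a mis-capitalized key that the storage handler would otherwise not recognize.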

2016-08-04 19_28_14-_new process__ – RapidMiner Studio Developer 7.2.000 @ RMUS-BPATIL.png


Download RapidMiner Radoop for free today from http://bit.ly/RadoopDL