Store in Hive using custom SerDe
RapidMiner Radoop’s “Store in Hive” operator is a versatile operator that lets you save data in Hive or external tables. This article describes how to enable custom storage and use a custom SerDe row format while storing.
Please ensure that the advanced parameters are enabled when you need to use a custom SerDe.
Once the custom storage option is checked, additional options become available. Change the row format box to "Custom SerDe".
Then provide the SerDe class name. Please ensure that the class exists on the classpath of the Hive server.
Additional SerDe properties can be set by clicking on the "Edit List" option. These case-sensitive key-value pairs are passed on to the table's SerDe.
For a list of built-in SerDes and instructions on how to write your own, see https://cwiki.apache.org/confluence/display/Hive/SerDe
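To make the SerDe settings more concrete, the sketch below builds the kind of Hive ROW FORMAT SERDE clause that a class name and a property list correspond to. It is an illustration only: the helper names (quote, serde_clause) are hypothetical, and the exact statement Radoop generates is not shown in this article. The example uses Hive's built-in OpenCSVSerde with its separatorChar and quoteChar properties.

```python
from typing import Dict, Optional


def quote(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def serde_clause(class_name: str,
                 properties: Optional[Dict[str, str]] = None) -> str:
    """Build a Hive ROW FORMAT SERDE clause (hypothetical helper, not a Radoop API)."""
    clause = f"ROW FORMAT SERDE {quote(class_name)}"
    if properties:
        # Keys are emitted exactly as given, because SerDe property
        # names are case sensitive.
        pairs = ", ".join(
            f"{quote(key)} = {quote(val)}" for key, val in properties.items()
        )
        clause += f" WITH SERDEPROPERTIES ({pairs})"
    return clause


if __name__ == "__main__":
    print(serde_clause(
        "org.apache.hadoop.hive.serde2.OpenCSVSerde",
        {"separatorChar": ";", "quoteChar": '"'},
    ))
```

Because the keys are passed through unchanged, a property entered as separatorchar would not be the same as separatorChar, which is why the capitalization in the "Edit List" dialog matters.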
You can also select additional Hive file format settings or Impala file format settings in the additional options available. Please note that older Hive versions may not support some of the file formats. As of Radoop 7.2 (August 2016), the supported Hive file formats are TEXTFILE, RCFILE, ORC, SEQUENCEFILE, PARQUET, and a custom format.
Selecting the custom format exposes additional options for the input format and the output format.
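The relationship between the file format choice and the input/output format options can be sketched as follows. This is a minimal illustration, not Radoop's implementation: storage_clause and the "CUSTOM" keyword are hypothetical names, and the format list is the one given in this article. The class names in the usage example are standard Hadoop/Hive text format classes.

```python
from typing import Optional

# File formats named in this article; older Hive versions may lack some of them.
BUILT_IN_FORMATS = ("TEXTFILE", "RCFILE", "ORC", "SEQUENCEFILE", "PARQUET")


def storage_clause(file_format: str,
                   input_format: Optional[str] = None,
                   output_format: Optional[str] = None) -> str:
    """Build a Hive STORED AS clause (hypothetical helper, not a Radoop API)."""
    name = file_format.strip().upper()
    if name in BUILT_IN_FORMATS:
        return f"STORED AS {name}"
    if name == "CUSTOM":
        # A custom format is only meaningful with both classes specified.
        if not (input_format and output_format):
            raise ValueError(
                "a custom format needs both an input and an output format class"
            )
        return (
            f"STORED AS INPUTFORMAT '{input_format}' "
            f"OUTPUTFORMAT '{output_format}'"
        )
    raise ValueError(f"unsupported file format: {file_format}")


if __name__ == "__main__":
    print(storage_clause("orc"))
    print(storage_clause(
        "custom",
        "org.apache.hadoop.mapred.TextInputFormat",
        "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
    ))
```

As with the SerDe class, any input or output format class named here has to be available on the classpath of the Hive server.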