RapidMiner Radoop lets you do code-free data preparation, blending, and cleansing in a distributed fashion on Hadoop. Often the data needs to be stored in Hadoop once the cleansing steps are complete, and Radoop’s “Store in Hive” operator is generally an excellent way to do that. Sometimes, however, you need to control the location (directory) where the data is stored rather than relying on Hive to manage it.
To see the options needed for this, make sure you have chosen to show the advanced parameters for the operator.
To specify a custom location, you can still use the “Store in Hive” operator and enter the location in the box highlighted below.
The path can be an external location on HDFS or on Amazon S3. For Amazon S3, use the s3://<bucket>/<path> or s3n://<bucket>/<path> format to specify the destination directory (it will be created if it does not exist). Please note that in this case the target directory cannot be checked or emptied beforehand, since it cannot be accessed directly without AWS credentials.
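Because an S3 destination cannot be checked beforehand, it can help to sanity-check the location string before pasting it into the operator. The following is only a minimal sketch in Python, not part of Radoop: the function name `describe_location` and the example bucket and paths are hypothetical, and the assumption that an HDFS location is either an absolute path or an `hdfs://` URI is not taken from the article.

```python
from urllib.parse import urlparse

# S3 schemes named in the article.
S3_SCHEMES = {"s3", "s3n"}


def describe_location(location: str) -> tuple:
    """Split a destination string into (storage, bucket, path).

    Hypothetical helper for illustration only; Radoop does not expose it.
    """
    parsed = urlparse(location)

    if parsed.scheme in S3_SCHEMES:
        # The article's format is s3://<bucket>/<path>, so a bucket is required.
        if not parsed.netloc:
            raise ValueError("S3 location needs a bucket: s3://<bucket>/<path>")
        return ("s3", parsed.netloc, parsed.path.lstrip("/"))

    if parsed.scheme in ("", "hdfs"):
        # Assumption: HDFS locations are absolute paths or hdfs:// URIs.
        if not parsed.path.startswith("/"):
            raise ValueError("HDFS location should be an absolute path")
        return ("hdfs", None, parsed.path)

    raise ValueError("Unsupported scheme: " + parsed.scheme)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(describe_location("s3n://my-bucket/cleansed/customers"))
    print(describe_location("/user/radoop/cleansed/customers"))
```

A check like this only catches malformed strings. It does not confirm that the bucket exists or that the cluster has the credentials to write to it.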