Change storage location on Hadoop

bhupendra_patil · August 2016

RapidMiner Radoop lets you do code-free data preparation, blending, and cleansing in a distributed fashion on Hadoop. Often the data needs to be stored in Hadoop once the cleansing steps are complete, and Radoop’s “Store in Hive” operator is generally an excellent way to do that. Sometimes, however, you need to control the location (directory) where the data is stored rather than relying on Hive to manage it.

To see the options needed for this, make sure you have selected to show the advanced parameters for the operator.

2016-08-04 18_40_27-RapidMiner - EY processes review and best practices - Meeting.png

To specify a custom location, you can still use the “Store in Hive” operator and enter the location in the box highlighted below.

2016-08-04 18_41_26-Cortana.png

The path can be an external location on HDFS or on Amazon S3. For Amazon S3, use the s3://<bucket>/<path> or s3n://<bucket>/<path> format to specify the destination directory (it will be created if it does not exist). Please note that in this case the target directory cannot be checked or emptied beforehand, since it cannot be accessed directly without AWS credentials.
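
If you generate these locations from a script or a process macro, it can help to sanity-check the string before pasting it into the operator. The snippet below is only a rough sketch of such a check and is not part of Radoop: the helper name describe_location, the bucket name, and the example paths are all made up for illustration. It only checks that the text matches one of the formats mentioned above (an absolute HDFS path, or s3://<bucket>/<path> / s3n://<bucket>/<path>) and does not contact HDFS or AWS.

```python
from urllib.parse import urlparse

# Schemes named in the post for Amazon S3 destinations.
S3_SCHEMES = {"s3", "s3n"}


def describe_location(location: str) -> dict:
    """Split a custom storage location into its parts.

    Hypothetical helper for illustration only; it checks the text format
    and never touches HDFS or S3.
    """
    parsed = urlparse(location)

    if parsed.scheme in S3_SCHEMES:
        # Expected shape: s3://<bucket>/<path> or s3n://<bucket>/<path>
        if not parsed.netloc:
            raise ValueError("S3 locations need a bucket: s3://<bucket>/<path>")
        return {
            "kind": "s3",
            "bucket": parsed.netloc,
            "path": parsed.path.lstrip("/"),
        }

    if parsed.scheme in ("", "hdfs"):
        # Plain HDFS directory, with or without an explicit hdfs:// prefix.
        if not parsed.path.startswith("/"):
            raise ValueError("HDFS locations should be absolute paths")
        return {"kind": "hdfs", "bucket": None, "path": parsed.path}

    raise ValueError(f"unsupported scheme: {parsed.scheme!r}")


if __name__ == "__main__":
    # Example values are assumptions, not taken from a real cluster.
    print(describe_location("s3n://my-bucket/cleansed/customers"))
    print(describe_location("/user/radoop/cleansed/customers"))
```

For the S3 case the bucket and the path inside the bucket come back separately, which is handy if you later need to look at the output with other AWS tooling, since the operator itself cannot check or empty that directory for you.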
