
Change storage location on Hadoop

bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist

RapidMiner Radoop lets you do code-free data prep, blending, and cleansing in a distributed fashion on Hadoop. Often the data needs to be stored in Hadoop once the cleansing steps are complete, and Radoop’s “Store in Hive” operator is generally an excellent way to store it in Hive. Sometimes, however, you need to control the location (directory) where the data is stored rather than relying on Hive to manage it.


To see the parameter needed for this, make sure the operator is set to show its advanced parameters.


2016-08-04 18_40_27-RapidMiner - EY processes review and best practices - Meeting.png

To specify a custom location, you can still use the “Store in Hive” operator and enter the location in the box highlighted below.


2016-08-04 18_41_26-Cortana.png
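For background, in plain HiveQL a table whose directory you choose yourself is typically an external table created with a LOCATION clause. The sketch below only builds that statement as text, to show what "controlling the location" means at the Hive level. The helper name, table name, columns, and path are all made up for illustration, and this is not necessarily the statement Radoop generates internally.

```python
def external_table_ddl(table, columns, location, stored_as="TEXTFILE"):
    """Build a HiveQL CREATE EXTERNAL TABLE statement with an explicit LOCATION.

    Hypothetical helper for illustration only; it is not part of Radoop.
    `columns` is a list of (name, hive_type) pairs.
    """
    if not columns:
        raise ValueError("at least one column is required")
    if "'" in location:
        # Keep the sketch simple: refuse paths that would need escaping.
        raise ValueError("location must not contain single quotes")
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE `{table}` ({cols}) "
        f"STORED AS {stored_as} "
        f"LOCATION '{location}'"
    )


# Example values are assumptions, not taken from the original post.
ddl = external_table_ddl(
    "cleansed_customers",
    [("id", "INT"), ("name", "STRING")],
    "/data/cleansed/customers",
)
print(ddl)
```

With an external table, dropping the table in Hive leaves the files in the chosen directory in place, which is usually the reason for picking the location yourself.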



The path can be an external location on HDFS or on Amazon S3. For Amazon S3, use the s3://<bucket>/<path> or s3n://<bucket>/<path> format to specify the destination directory (it will be created if it does not exist). Please note that in this case the target directory cannot be checked or emptied beforehand, since it cannot be accessed directly without AWS credentials.
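Since an S3 target cannot be checked beforehand, it can help to sanity-check the path string yourself before pasting it into the operator. Here is a minimal sketch in Python that only inspects the shape of the string and never contacts HDFS or S3. The function name and the example paths are hypothetical and not part of Radoop; accepting hdfs:// URIs and requiring an absolute path for HDFS are assumptions of this sketch.

```python
from urllib.parse import urlparse


def describe_destination(path):
    """Classify a destination string and split it into its parts.

    Hypothetical helper: accepts s3://<bucket>/<path>, s3n://<bucket>/<path>,
    an absolute HDFS path, or an hdfs:// URI (the last two are assumptions).
    """
    parsed = urlparse(path)
    if parsed.scheme in ("s3", "s3n"):
        if not parsed.netloc:
            raise ValueError(f"missing bucket in {path!r}")
        return {
            "kind": "s3",
            "scheme": parsed.scheme,
            "bucket": parsed.netloc,
            "path": parsed.path.lstrip("/"),
        }
    if parsed.scheme in ("", "hdfs") and parsed.path.startswith("/"):
        return {"kind": "hdfs", "path": parsed.path}
    raise ValueError(f"unrecognized destination format: {path!r}")


# Example values are made up for illustration.
s3_target = describe_destination("s3n://my-bucket/radoop/output")
hdfs_target = describe_destination("/user/radoop/output")
print(s3_target)
print(hdfs_target)
```

A check like this catches a missing bucket or a wrong scheme early, but it cannot tell you whether the directory already contains data.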
