How to delete a file from an S3 bucket folder

PBMPBM Member Posts: 6 Contributor II
edited September 2020 in Help
Hi, 
We use Amazon S3 to load data into an Amazon Redshift database. After the data is loaded, we want to clean up the S3 text files. I see the Read, Loop, and Write S3 operators. Does RM allow deleting files from S3, given that I have delete permission in S3? If not, any workaround suggestions?
Thank You

Answers

  • PBMPBM Member Posts: 6 Contributor II
    Hi all,

    OK, it seems that RapidMiner does not have an operator for this, but you can extend it with external scripts. In this case I used the Execute Python operator and the boto library.

    from boto.s3.connection import S3Connection

    # Placeholder values: replace with your own bucket, credentials, and folder
    S3_BUCKET_NAME = 'myBucket'
    AWS_ACCESS_KEY = 'myAccesskey'
    AWS_SECRET_KEY = 'mySecretKey'
    # Folder prefix; it should end with '/' so that the listing returns the
    # files inside the folder rather than only the folder entry
    path_to_file = 'mysubFolderPath/'

    def rm_main():
        # Create the connection
        conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)

        # Connect to my bucket
        bucket = conn.get_bucket(S3_BUCKET_NAME)

        # List the entries directly under the subfolder and delete them
        # (skipping the key that represents the subfolder itself)
        for key in bucket.list(prefix=path_to_file, delimiter='/'):
            if key.name != path_to_file:
                bucket.delete_key(key.name)

    if __name__ == "__main__":
        rm_main()

    Thank You 
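
For anyone reading this later: boto is the legacy AWS SDK for Python, and boto3 is its successor. The same cleanup could be sketched with boto3 roughly as below. This is only a sketch under a few assumptions: boto3 is installed in the Python environment RapidMiner uses, and credentials are picked up from boto3's default chain (environment variables or the AWS config files) instead of being hard-coded. The helper name `delete_files_under_prefix` and its `client` parameter are made up for illustration and are not part of RapidMiner or boto3.

```python
def delete_files_under_prefix(bucket_name, prefix, client=None):
    """Delete the objects directly under `prefix`, keeping the folder key itself.

    Hypothetical helper: `client` is any object exposing the boto3 S3 client
    methods used below. If it is omitted, a real boto3 client is created.
    Returns the list of keys that were sent for deletion.
    """
    if client is None:
        import boto3  # assumption: boto3 is installed and credentials are configured
        client = boto3.client("s3")

    deleted = []
    paginator = client.get_paginator("list_objects_v2")
    # Delimiter="/" limits "Contents" to objects directly under the prefix;
    # deeper levels are reported separately and are left untouched here.
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/"):
        keys = [obj["Key"] for obj in page.get("Contents", []) if obj["Key"] != prefix]
        if keys:
            # One batch request per listing page
            client.delete_objects(
                Bucket=bucket_name,
                Delete={"Objects": [{"Key": k} for k in keys], "Quiet": True},
            )
            deleted.extend(keys)
    return deleted


def rm_main():
    # Entry point called by the Execute Python operator; placeholder values
    delete_files_under_prefix("myBucket", "mysubFolderPath/")
```

As in the original script, the prefix is expected to end with '/', and the key for the folder itself is skipped so the empty folder remains visible in the S3 console.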