Read TSV & Read Multiple files from S3

MichaelWall Member Posts: 9 Contributor II
edited December 2018 in Help

Hi All,

 

I'm trying to read in files from Amazon S3. The 'Read Amazon S3' operator itself works fine, but I have three related questions:

1) I'm trying to read in tsv files, so I've connected 'Read Amazon S3' to 'Read CSV' but it results in no records. I also tried with 'Read Excel' but that just throws errors. Is there an operator that can handle tsv files?

2) Eventually I want to be able to read all files in an S3 bucket, rather than just selecting one. So is there a way of looping through all the files?

3) Are there any operators I can apply to filter multiple file types through the workflow, so they get routed to the appropriate 'Read' operator? A bucket may have different file formats in it, and the process will throw errors if it tries to read a file with the wrong extension.

 

Thanks

 

Mike

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist

    Dear Mike,

     

    1) Have you tried Read CSV with tab as the delimiter? By default it uses ; as the delimiter.

    2) Have you had a look at Loop Amazon S3?

    3) Loop Amazon S3 has a filter option where you can use a regex such as .+tsv to include only tsv files in the loop.
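
To illustrate points 1) and 3) outside RapidMiner, here is a minimal Python sketch. It is not RapidMiner code: the sample text, the object keys, and the helper names `parse_delimited` and `filter_keys` are all made up for illustration, and it assumes the filter regex has to match the whole file name. It shows how the same text parses with ; versus tab as the delimiter, and which names .+tsv lets through compared with the stricter .+\.tsv.

```python
import csv
import io
import re

# Hypothetical sample standing in for the contents of a TSV file from S3.
SAMPLE_TSV = "id\tname\n1\talpha\n2\tbeta\n"

# Hypothetical object keys standing in for the contents of a bucket.
KEYS = ["data/a.tsv", "data/b.csv", "data/c.xlsx", "notes_tsv"]


def parse_delimited(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def filter_keys(keys, pattern):
    """Keep only the keys that the regex matches in full (an assumption)."""
    regex = re.compile(pattern)
    return [key for key in keys if regex.fullmatch(key)]


if __name__ == "__main__":
    # With ';' as the delimiter, each line stays a single unsplit column.
    print(parse_delimited(SAMPLE_TSV, ";"))
    # With tab as the delimiter, the columns are separated as intended.
    print(parse_delimited(SAMPLE_TSV, "\t"))
    # '.+tsv' accepts any name ending in "tsv", with or without a dot.
    print(filter_keys(KEYS, r".+tsv"))
    # '.+\.tsv' additionally requires the literal ".tsv" extension.
    print(filter_keys(KEYS, r".+\.tsv"))
```

If a bucket can contain names that merely end in "tsv" without being .tsv files, the escaped-dot pattern is the safer choice. The same idea extends to routing: one pattern per file type, each feeding the matching 'Read' operator.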

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany