
Splitting output into multiple (many) csv

MichaelWall Member Posts: 9 Contributor II
edited November 2018 in Help



Question from a newbie. I have a process built in RapidMiner Studio that creates an output containing anywhere between 100 and 5000 rows (depending on the starting input). I want to write the output as one CSV per row. At the moment I can get the full data set using the Write CSV operator, but that gives me one file with everything, when I want one CSV per record. I've tried doing this in post-processing by adding a new section to the Python script that handles the data after it's been through the process, but the formatting of the CSV is causing problems. I really want it to come out of RapidMiner in separate files to maintain the integrity of the results.


Any thoughts appreciated.



Best Answer

  • bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist
    Solution Accepted

    Hi @MichaelWall

    Welcome to the RapidMiner community.

    See if the attached process helps you. You can open this process from File > Import Process.

    You may need to change the path of the CSV location.

    Here is what it does:

    It loops over the examples (rows), one row at a time.

    Inside the loop, you filter to the current row number and then write that one row to its own CSV.


    The file name is the row number followed by .csv.


    If you need to name the files differently, that should be possible with an additional operator, but hopefully this will get you started.


  • MichaelWall Member Posts: 9 Contributor II

    Thanks for this, works really well, much faster than the existing process I am replicating. The key thing was to set the iteration macro on the Loop Examples operator to row_number so it indexed through each row.
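
For anyone who still wants to do the split in Python post-processing rather than inside RapidMiner, the same loop, filter, and write idea can be sketched roughly as below. This is only an illustration, not the attached process: the function name split_csv_by_row is made up, and it assumes the single CSV has a header row, uses the default comma separator, is UTF-8 encoded, and that rows should be numbered from 1. Using the csv module for both reading and writing should keep quoted fields intact, which may help with the formatting problems mentioned in the question.

```python
import csv
from pathlib import Path


def split_csv_by_row(source, out_dir):
    """Write each data row of `source` to its own file, <row_number>.csv.

    Hypothetical helper: assumes a header row, comma separator and
    UTF-8 encoding. Row numbers start at 1 (an assumption).
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    with open(source, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            # Empty input file: nothing to split.
            return written

        for row_number, row in enumerate(reader, start=1):
            target = out_dir / f"{row_number}.csv"
            with open(target, "w", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                writer.writerow(header)  # repeat the header in every file
                writer.writerow(row)     # the single record for this file
            written.append(target)

    return written
```

Calling split_csv_by_row("results.csv", "rows") (both names are placeholders) would create the folder rows and fill it with 1.csv, 2.csv, and so on, one file per record.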
