
Reading non-standard data file structures - pls help

roger_iles Member Posts: 1 Contributor I
edited November 2018 in Help

I am evaluating RapidMiner as a solution for research and application prototyping. It's important to have an easy way to import data and manipulate it into the structure I need before storing the result to a DB. I need this to work repeatedly across many files.

However, I have hit an early block: although I can read in data from a file containing a standard table, I hit issues if the file contains a slightly different structure. Is there a straightforward way to read in CSV and Excel data when the header structure is either not standard or even repeats (e.g. multiple data sets in one file appended one after another)?


I have provided an example of one of the data files below. One of the columns is time, but there is no date column; the date is instead stored as metadata at the top of the file. I need to add the date to the time to create a date-time column, but I can't find a straightforward way to read in the different parts of the data file (metadata and column data) separately and then perform the data transformation to create a new table to store to the DB.


Any advice would be welcome.




ABC Aircraft Registration      
XYZ Nose Number      
123 Flight Number      
CDE Departure Station      
FGH Destination Station      
31.10.2014 Date        
785 GROUND 18:02:44 11.97018 -8.92304 874
845 GROUND 18:12:44 21.9698 -7.9315 881
905 GROUND 18:22:43 31.96881 -6.93081 892


  • bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist

    See if the attached example and accompanying videos give you some ideas.

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Yes, you can comment out the repeating header lines in the Read CSV or XLS wizard. I do this all the time with NOAA weather data. The Read CSV operator is like the Swiss Army knife of data loaders in RapidMiner; it can handle many other file formats and encodings too. I have used it to read in txt files as well.
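
To make the metadata/column-data split from the question concrete, here is a minimal sketch in plain Python, outside RapidMiner. It is not a RapidMiner process and not the method from the attached example. It assumes the file is whitespace-delimited exactly as pasted in the question, that a data row has six fields with a HH:MM:SS time in the third, and that every other non-empty line is metadata of the form "value label". The function names and dictionary keys are hypothetical.

```python
from datetime import datetime

# Sample copied from the question; the whitespace delimiter is an assumption.
SAMPLE = """\
ABC Aircraft Registration
XYZ Nose Number
123 Flight Number
CDE Departure Station
FGH Destination Station
31.10.2014 Date
785 GROUND 18:02:44 11.97018 -8.92304 874
845 GROUND 18:12:44 21.9698 -7.9315 881
905 GROUND 18:22:43 31.96881 -6.93081 892
"""


def is_data_row(parts):
    """Assumed rule: six fields, with a HH:MM:SS time in the third."""
    if len(parts) != 6:
        return False
    try:
        datetime.strptime(parts[2], "%H:%M:%S")
    except ValueError:
        return False
    return True


def parse_flight_file(text):
    """Return one dict per data row, stamped with the metadata seen so far."""
    rows = []
    meta = {}  # label -> value, e.g. "Date" -> "31.10.2014"
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if is_data_row(parts):
            if "Date" not in meta:
                raise ValueError("data row found before any Date metadata line")
            # Combine the metadata date (dd.mm.yyyy) with the row's time.
            stamp = datetime.strptime(
                meta["Date"] + " " + parts[2], "%d.%m.%Y %H:%M:%S"
            )
            rows.append({"timestamp": stamp, "meta": dict(meta), "fields": parts})
        else:
            # Metadata line: first token is the value, the rest is the label.
            meta[" ".join(parts[1:])] = parts[0]
    return rows


rows = parse_flight_file(SAMPLE)
print(rows[0]["timestamp"])  # 2014-10-31 18:02:44
```

Because each metadata line overwrites the previous value for its label, a file with several data sets appended one after another is handled the same way: rows that follow a new Date line pick up the new date.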
