Hello all,


I have a problem with my input data. I want to read a 2 GB CSV file, but every time the operator stops at 40% and then an error message appears saying that there is not enough memory (I have 16 GB of RAM). What can I do about that? Since RapidMiner is data mining software, I expected it to do things like that easily.

    Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    My first suggestion would be to split your CSV into smaller chunks (using any free utility, such as a CSV splitter), read the chunks in with a Loop Files operator, and then append them together. I don't know how RapidMiner handles memory management for large files like that. Perhaps one of the RM staffers will have another suggestion.
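If no splitter utility is at hand, the chunking step can be sketched in a few lines of Python. This is only an illustration of the idea, not a RapidMiner feature: the function name `split_csv`, the output file pattern `chunk_0000.csv`, and the default of one million rows per chunk are all assumptions made for this example. It also assumes a comma-separated, UTF-8 encoded file with a single header row.

```python
import csv
from pathlib import Path


def split_csv(source, out_dir, rows_per_chunk=1_000_000):
    """Split `source` into numbered chunk files, repeating the header in each.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with open(source, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input, nothing to write
        out = None
        writer = None
        for index, row in enumerate(reader):
            # Start a new chunk file every `rows_per_chunk` data rows.
            if index % rows_per_chunk == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{len(chunk_paths):04d}.csv"
                out = open(path, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                chunk_paths.append(path)
            writer.writerow(row)
        if out is not None:
            out.close()
    return chunk_paths
```

Because the file is streamed row by row, the script itself never holds more than one row in memory. Calling `split_csv("input.csv", "chunks")` (both paths are placeholders) would leave a folder of chunk files that a Loop Files operator could then iterate over.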

    zprekopcsak RapidMiner Certified Expert, Member Posts: 47 Guru


    Can you share a data sample so we can investigate?

    Without knowing anything about the data, I would have two suggestions to try:

    1. Make sure that the attribute types are properly set in the import wizard. If you store a datetime as a nominal instead of a proper datetime, the memory footprint grows significantly. The same applies to attributes that have only missing values in the first X rows: RapidMiner cannot guess their types, so unless you set them manually, they will default to nominal.
    2. You may want to try the in-product beta mode that has a lower memory footprint in general. See more details here: http://static.rapidminer.com/rnd/html/rapidminer-7.3-beta-mode.html
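To give a feel for why the first suggestion matters, here is a small Python comparison of the same timestamps held as text and as plain numbers. It is only an analogy: it measures Python objects, not RapidMiner's internal storage, and the start date, the one-second spacing, and the sample size of 100,000 values are arbitrary assumptions chosen for the example.

```python
import sys
from array import array
from datetime import datetime, timedelta

# Assumed sample data: 100,000 distinct timestamps, one second apart.
start = datetime(2017, 1, 1)
stamps = [start + timedelta(seconds=i) for i in range(100_000)]

# Text representation: one distinct string object per row, plus the list.
as_text = [s.strftime("%Y-%m-%d %H:%M:%S") for s in stamps]
text_bytes = sum(sys.getsizeof(s) for s in as_text) + sys.getsizeof(as_text)

# Numeric representation: seconds since `start`, packed as 8-byte doubles.
as_numbers = array("d", ((s - start).total_seconds() for s in stamps))
numeric_bytes = as_numbers.itemsize * len(as_numbers)

print(f"text:    {text_bytes / 1e6:.1f} MB")
print(f"numeric: {numeric_bytes / 1e6:.1f} MB")
```

The exact figures depend on the Python build, but the text version comes out several times larger, because every distinct timestamp string has to be kept in full. A column of nearly unique date strings is therefore about the worst case for a text-like type.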



    eldenoso Member Posts: 65 Contributor I

    The only idea I came up with (after searching the web) was to import the CSV into an SQL database (e.g. Postgres), so that I can use the stream data operator?

    Unfortunately I cannot upload the data, but I can tell you everything about it. It contains 8 attributes and roughly 80 million examples. The only attribute I had to change in the import wizard was the date (its type was set incorrectly). Ironically, if I don't change the date type, the import succeeds.
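These numbers allow a rough back-of-the-envelope estimate of the in-memory size. The sketch below assumes that every cell is held as one 8-byte double. That is an assumption made for the calculation, not a confirmed description of how RapidMiner stores this table, and it ignores any overhead for nominal value mappings. The helper name `estimate_table_bytes` is made up for this example.

```python
rows = 80_000_000        # about 80 million examples, from the post above
attributes = 8           # from the post above
file_bytes = 2 * 10**9   # the 2 GB CSV, taken as decimal gigabytes
bytes_per_cell = 8       # ASSUMPTION: one 8-byte double per cell


def estimate_table_bytes(rows, attributes, bytes_per_cell):
    """Raw cell storage only; mappings and other overhead are not included."""
    return rows * attributes * bytes_per_cell


table_bytes = estimate_table_bytes(rows, attributes, bytes_per_cell)
print(f"average CSV bytes per row: {file_bytes / rows:.0f}")      # 25
print(f"estimated table size: {table_bytes / 10**9:.2f} GB")     # 5.12 GB
```

Under that assumption the raw table would need about 5 GB, which is more than twice the file size, since each row takes only about 25 bytes on disk. Any extra per-value bookkeeping for the date column would come on top of that.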

    Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Sometimes dates are not read in correctly by the Read CSV operator, but that's OK. You can always convert those date values afterwards using the Nominal to Date operator.

