Issue importing a large XLS or CSV dataset

arno Member Posts: 4 Contributor I
edited November 2018 in Help
Hi everyone!

I am just starting out with RapidMiner and I must say I'm really impressed by it.
I do, however, encounter an issue when I try to import my dataset.

The dataset is an XLS file of about 16 MB with 100,000 examples and 80 attributes.
It imports fine, but RapidMiner gets stuck when it tries to show the "ExampleSet" results output.
Apparently the file is too big to actually view?

I already updated my Java to 64-bit (on my 64-bit Windows 8 OS). I have 12 GB of RAM and changed the RapidMinerGUI.bat file so that it can use 8 GB of RAM instead of the default 512 MB.

Is there a way to turn off viewing the dataset in the results? Or is there something else I should take into account to run this dataset?
I tried importing just the first 10,000 rows of the same dataset, and this went smoothly, so I am quite convinced it must be the amount of data that makes RapidMiner "hang".

Does anyone here have some experience with datasets this large and have some hints for me?

Thanks in advance,

Arno

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi Arno,

    In RapidMiner, please have a look at the System Monitor view. How much memory does RapidMiner consume when you try to load the big data set? How much when loading the small one?
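
As a rough yardstick for those numbers, here is a back-of-the-envelope estimate in Python of how much memory the raw table might need. It assumes every cell is held as one 8-byte floating-point value, which is an assumption for illustration and not a statement about how RapidMiner stores data internally; the function name `estimate_bytes` is made up for this sketch. Real usage will be higher because of object overhead, nominal value mappings, and the result view itself.

```python
def estimate_bytes(rows, columns, bytes_per_cell=8):
    """Estimate the raw size of a table, assuming a fixed number of bytes per cell."""
    return rows * columns * bytes_per_cell


if __name__ == "__main__":
    # 100,000 examples with 80 attributes, as described in the question.
    size = estimate_bytes(100_000, 80)
    print(f"{size} bytes, about {size / 1024**2:.1f} MiB")
```

Under that assumption the table comes to 64,000,000 bytes, roughly 61 MiB, which is far below an 8 GB limit. If the System Monitor shows memory filling up anyway, the raw data size alone probably does not explain it.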

    If you do not want to display the result, just don't connect the output of the last operator to the process output on the far right. If you only want to display parts of it, please use a Sample operator to reduce the size of the data.
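
If you would rather shrink a CSV export before it ever reaches RapidMiner, the same sampling idea can be sketched outside the tool. The following Python snippet is only an illustration and not what the Sample operator does internally: it uses reservoir sampling to pick `k` random data rows in a single pass, so the whole file never has to sit in memory. The function name `sample_rows` and the file names in the usage note are hypothetical.

```python
import csv
import io
import random


def sample_rows(lines, k, seed=None):
    """Return (header, rows): the CSV header plus up to k uniformly sampled data rows.

    `lines` is any iterable of CSV text lines, such as an open file.
    Uses reservoir sampling, so only k rows are kept in memory at a time.
    """
    rng = random.Random(seed)
    reader = csv.reader(lines)
    header = next(reader)
    reservoir = []
    for i, row in enumerate(reader):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return header, reservoir


if __name__ == "__main__":
    # Small in-memory stand-in for a large CSV file.
    text = "id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(1000))
    header, rows = sample_rows(io.StringIO(text), 10, seed=42)
    print(header)
    print(len(rows), "rows sampled")
```

For a real file you would pass an open file handle, for example `open("big.csv", newline="")`, and write the header and sampled rows back out with `csv.writer` before importing the smaller file.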

    It is strange, however, if the data really loads fine but does not display. We'll have a look at that.

    Best regards,
    Marius