RapidMiner 9.1 RAM usage

varunm1 Member Posts: 563 Unicorn
edited December 2018 in Help
Hello,

After installing RM 9.1 on my PC with 32 GB of RAM, I see that this version uses 4 GB at startup without running any process. Is this normal? When I try to apply the Deep Learning operator with a CNN for classification, the software crashes. Watching the Task Manager, I suspect this is caused by RAM usage growing until the machine runs out of memory. The dataset is 315000*102 and sparse; the CSV file is 65 MB.

@sgenzer @hughesfleming68 any suggestions on this?

Regards,
Varun 

Best Answer

Answers

  • varunm1 Member Posts: 563 Unicorn
    Hi Alex, I set it manually to 20 GB, but the Task Manager shows more than 20 GB of memory for the RM process. Based on your explanation, I understand that this cannot be avoided when using the GUI.
  • SGolbert RapidMiner Certified Analyst, Member Posts: 323 Unicorn
    Hi Varun,

    Can you load the dataset into RapidMiner (Retrieve without further operators, run, and view the results) and check the memory consumption? It could be that the RM table is using memory for all the missing values, in which case the dataset is very large. If so, you have to use the Declare Missing Values operator and make sure that RM sees them as missing and not, for example, as empty strings.

    Regards,
    Sebastian


  • varunm1 Member Posts: 563 Unicorn
    Hi Sebastian,

    Thanks for your response. The RM GUI takes approximately 3.5 GB with a blank process window. I loaded the dataset as you suggested, and usage is closer to 5 GB. The dataset doesn't have missing values, as it is generated by another algorithm in MATLAB. Please see the screenshots below for reference.

    Initial GUI:



    While data is getting processed:


    Thanks,
    Varun
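As a rough sanity check on the 315000*102 dataset from the original question, the sketch below estimates how much memory the table would need if every cell were stored densely. The 8 bytes per cell is an assumption (one 64-bit floating-point value per cell), not something confirmed about RapidMiner's internals, and `dense_bytes` is a hypothetical helper name.

```python
# Rough dense-storage estimate for the dataset described in the question.
# ASSUMPTION: each cell is stored as one 64-bit float (8 bytes); this is
# an illustrative figure, not a statement about RapidMiner's internals.
ROWS = 315_000
COLS = 102
BYTES_PER_CELL = 8  # assumed: double precision


def dense_bytes(rows: int, cols: int, bytes_per_cell: int = BYTES_PER_CELL) -> int:
    """Return the bytes needed to store every cell of a rows x cols table."""
    return rows * cols * bytes_per_cell


total = dense_bytes(ROWS, COLS)
print(f"{total:,} bytes = {total / 1024**2:.0f} MiB")
# prints: 257,040,000 bytes = 245 MiB
```

Under that assumption the raw values come to roughly 245 MiB, so the increase of about 1.5 GB seen after loading would not be explained by the raw values alone.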
  • David_A Moderator, Employee, RMResearcher, Member Posts: 177 RM Research
    Hi,

    Don't be confused by the RAM usage for an empty process. Java tends to allocate a lot of memory in advance; that's what you see there.

    Are you using the Deep Learning operator from core RapidMiner (with the H2O library) or the Deep Learning extension?
  • varunm1 Member Posts: 563 Unicorn
    Hi @David_A

    Deep Learning Extension. I also learned from @hughesfleming68 that there are some memory leaks, which he brought to your attention.

    Thanks
    Varun
  • hughesfleming68 Member Posts: 220 Unicorn
    edited January 4
    Hi Varun, those might be fixed, as they affected an earlier version of RM 9. It was one process that was using 40 GB of memory. There were actually over 500 operators in that process. It worked correctly on 8.2, but I have not tested it again. I will check over the weekend. In the end, I duplicated the whole process in Python and I have been using it that way ever since.

    Edit: I just tested it again and it worked fine in 9.1. It still used 37 GB but didn't crash. It was fast as well, faster than I remember.
  • SGolbert RapidMiner Certified Analyst, Member Posts: 323 Unicorn

    I mentioned the missing values because RapidMiner has different internal structures to represent example sets. One of them is optimized for sparse matrices, but if the values don't appear as missing to RapidMiner (even though they were missing in some other program or database), this data structure won't be used. It's just a hunch, as I don't know the internals in detail.

    Another question is whether the extensions can make use of these optimized data structures.

    Regards,
    Sebastian
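To illustrate why it matters whether values are recognized as "empty", here is a minimal sketch comparing a dense layout with a generic sparse layout that stores only the non-default cells as (index, value) pairs. The byte sizes, the 5% density, and the function names are all assumptions chosen for illustration; this is not RapidMiner's actual data structure.

```python
# Generic dense vs. sparse size comparison.
# ASSUMPTIONS: 8-byte values, 4-byte column indices, and a hypothetical
# 5% density. None of this reflects RapidMiner's real internals.
ROWS, COLS = 315_000, 102


def dense_size(rows: int, cols: int, value_bytes: int = 8) -> int:
    """Every cell is stored, whether or not it holds a meaningful value."""
    return rows * cols * value_bytes


def sparse_size(rows: int, cols: int, density: float,
                value_bytes: int = 8, index_bytes: int = 4) -> int:
    """Only the non-default cells are stored, each with its index."""
    stored_cells = round(rows * cols * density)
    return stored_cells * (value_bytes + index_bytes)


# If only 5% of the cells are seen as non-default, sparse storage is small.
print(sparse_size(ROWS, COLS, density=0.05))  # 19278000
# If every cell is seen as a real value, sparse storage costs more than dense.
print(sparse_size(ROWS, COLS, density=1.0) > dense_size(ROWS, COLS))  # True
```

In this simplified model, a sparse layout only pays off when most cells are actually recognized as default or missing, which is the scenario the hunch above describes.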