Out of memory error

Saurabh_Sawant_24 Member Posts: 7 Contributor I
edited December 2018 in Help
I have a data set of 1.7 million transactions. The process runs on a Windows server with 32 GB of RAM, but I am still getting an "Out of memory" error with the HBOS algorithm.
Can someone help?

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You have a few options here.  These are listed in the order I would pursue them:
    1. Sample your dataset first, define the outlier boundaries from the HBOS results on that sample, and then create rules to tag outliers in your full dataset.  This should work fine, because you probably don't need all 1.7 million records to define your outliers with the HBOS technique.
    2. Temporarily increase the RAM on your server (if you are running in a cloud environment like AWS or Azure, this is pretty easy to do).
    3. Try the RapidMiner Cloud offering, which gives you access to a RapidMiner-provided server for exceptionally large jobs, billed per credit hour.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
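
Option 1 above can be sketched outside RapidMiner as follows. This is only a minimal Python/NumPy illustration under stated assumptions, not the RapidMiner HBOS operator: the helper names `fit_hbos` and `score_hbos`, the synthetic data, the 10% sample size, and the 99th-percentile cutoff are all hypothetical choices. It builds the histograms from a sample only, derives a score threshold from that sample, and then tags the full dataset in chunks so that no step needs everything in memory at once.

```python
import numpy as np

def fit_hbos(sample, n_bins=None, eps=1e-9):
    """Build one equal-width histogram per column from the sample."""
    n_rows, n_cols = sample.shape
    if n_bins is None:
        # Assumption: use roughly sqrt(N) bins, N = number of sampled rows.
        n_bins = max(1, int(round(np.sqrt(n_rows))))
    model = []
    for j in range(n_cols):
        counts, edges = np.histogram(sample[:, j], bins=n_bins)
        heights = counts / counts.max()            # tallest bin gets height 1
        model.append((edges, np.maximum(heights, eps)))
    return model

def score_hbos(model, data, eps=1e-9):
    """Sum log(1 / bin height) over columns; higher means more unusual."""
    scores = np.zeros(data.shape[0])
    for j, (edges, heights) in enumerate(model):
        col = data[:, j]
        idx = np.searchsorted(edges, col, side="right") - 1
        idx[col == edges[-1]] = len(heights) - 1   # right edge is in last bin
        inside = (idx >= 0) & (idx < len(heights))
        h = np.full(data.shape[0], eps)            # unseen range: tiny height
        h[inside] = heights[idx[inside]]
        scores += np.log(1.0 / h)
    return scores

# Synthetic stand-in for the real transactions (hypothetical data).
rng = np.random.default_rng(0)
full = rng.normal(size=(200_000, 3))
full[:50] += 8.0                                   # planted outliers

# 1) Fit on a sample only.
sample = full[rng.choice(len(full), size=20_000, replace=False)]
model = fit_hbos(sample)

# 2) Turn the sample scores into a rule (assumed cutoff: 99th percentile).
threshold = np.quantile(score_hbos(model, sample), 0.99)

# 3) Tag the full dataset chunk by chunk.
is_outlier = np.zeros(len(full), dtype=bool)
chunk_size = 50_000
for start in range(0, len(full), chunk_size):
    chunk = full[start:start + chunk_size]
    is_outlier[start:start + chunk_size] = score_hbos(model, chunk) > threshold
```

The fitted model is just a set of bin edges and heights per column, so it stays small regardless of how many rows are scored afterwards.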
  • M_Martin RapidMiner Certified Analyst, Member Posts: 125 Unicorn
    In addition to the suggestions from Telcontar120 above, have you allocated the maximum feasible amount of memory to RapidMiner Studio on your Windows Server machine?  You mention that your machine has 32 GB of RAM in total, but how much of that 32 GB have you made available to RapidMiner Studio?
    From the Settings --> Preferences menu, you can specify the maximum amount of RAM that RM Studio may use.  One of the machines I use for RapidMiner development has 32 GB of RAM, and I have allocated up to 23 GB of it to RapidMiner Studio because I have had processes fail from running out of memory.
    RapidMiner Studio doesn't automatically grab all of the memory you allocate when it loads, but if you allocate (for example) 20 GB of RAM, RapidMiner Studio will use up to that amount if needed.  The Resource Monitor panel always shows how much RAM RM Studio is using at any given time.
    I have also found it useful to close and restart RM Studio after running a memory-intensive process.  This frees up considerable memory, because at startup RM Studio does not need all of the RAM you have allocated in Settings --> Preferences.
    Hope this has been helpful and best wishes, Michael Martin
  • Saurabh_Sawant_24 Member Posts: 7 Contributor I
    @M_Martin
    We are running this process on RM Server, with 25 GB of RAM allocated to one job agent container and 5 GB to Studio.
    After the process starts, we close Studio to free up memory.
  • Saurabh_Sawant_24 Member Posts: 7 Contributor I
    @Telcontar120
    As I am a newbie to both data science and RapidMiner,
    can you tell me how to define or identify boundaries with HBOS?

    An example would really help me to understand. Thanks.
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    @Saurabh_Sawant_24 HBOS doesn't require you to define the boundaries for outliers, just the number of bins used to generate the histograms.  In my experience it is pretty robust and not overly sensitive to this parameter, so you can start with the default setting of -1 and see what kind of results that generates.  [-1 is a special value that sets the number of bins to sqrt(N).]
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
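
To make the sqrt(N) default concrete, here is a small Python calculation of the bin count it implies for the 1.7 million rows in the question and for a sample. The helper name `sqrt_bins`, the 100,000-row sample size, and rounding to the nearest integer are assumptions for illustration; the thread does not say how the operator rounds.

```python
import math

def sqrt_bins(n_rows):
    """Number of histogram bins under a sqrt(N) rule (rounding is assumed)."""
    return max(1, round(math.sqrt(n_rows)))

full_rows = 1_700_000     # dataset size from the question
sample_rows = 100_000     # hypothetical sample size

print(sqrt_bins(full_rows))    # bins if HBOS runs on every row
print(sqrt_bins(sample_rows))  # bins if HBOS runs on the sample
```

Because the bin count grows only with the square root of N, running on a sample reduces the number of bins far less than it reduces the number of rows.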
  • kanika15 Member Posts: 1 Newbie
    Hi, in the above scenario, if the pipeline fails on AI Hub because of a memory issue, can that failure be captured via exception handling?