
Out of memory error

Saurabh_Sawant_24 Member Posts: 7 Learner II
edited December 2018 in Help
I have a dataset of 1.7 million transactions, running on a Windows server with 32 GB of RAM, but I am still getting an "Out of memory" error with the HBOS algorithm.
Can someone help?

Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You have a few options here, listed in the order I would pursue them:
    1. Sample your dataset first, then define the outlier boundaries based on HBOS, and then create rules to tag outliers in your full dataset. This should work fine because you probably don't need all 1.7MM records to define your outliers using the HBOS technique.
    2. Temporarily increase your server's RAM (if you are running in a cloud environment like AWS or Azure, this is pretty easy to do).
    3. Try the RapidMiner Cloud offering, which lets you access a RapidMiner-provided server to handle exceptionally large jobs on a per-credit-hour basis.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
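A rough illustration of the first option above (fit on a sample, then tag the full dataset), written as a minimal NumPy sketch and not as RapidMiner's own HBOS operator. The function names `fit_hbos` and `score_hbos`, the static-width bins, the synthetic data, and the 99% cutoff are all assumptions made for the example.

```python
import numpy as np

def fit_hbos(sample, n_bins=None):
    """Build one equal-width histogram per column from a sample."""
    n_rows, n_cols = sample.shape
    if n_bins is None:
        n_bins = max(1, int(np.sqrt(n_rows)))  # sqrt(N) rule of thumb
    model = []
    for j in range(n_cols):
        counts, edges = np.histogram(sample[:, j], bins=n_bins)
        # Scale so the tallest bin has height 1; floor empty bins to avoid log(0).
        heights = np.maximum(counts / counts.max(), 1e-9)
        model.append((edges, heights))
    return model

def score_hbos(model, data):
    """Score = sum over columns of log(1 / bin height); higher means more unusual."""
    scores = np.zeros(data.shape[0])
    for j, (edges, heights) in enumerate(model):
        col = data[:, j]
        idx = np.searchsorted(edges, col, side="right") - 1
        idx[col == edges[-1]] = len(heights) - 1  # right edge belongs to the last bin
        inside = (idx >= 0) & (idx < len(heights))
        # Values outside the sampled range get the floor height (treated as very rare).
        h = np.full(data.shape[0], 1e-9)
        h[inside] = heights[idx[inside]]
        scores += np.log(1.0 / h)
    return scores

# Synthetic stand-in for the real transactions (hypothetical data).
rng = np.random.default_rng(0)
full = rng.normal(size=(100_000, 3))
full[:5] += 15  # plant a few obvious outliers

# 1) Fit the histograms on a 10% sample only.
sample = full[rng.choice(len(full), size=10_000, replace=False)]
model = fit_hbos(sample)

# 2) Derive a cutoff from the sample's own scores (99th percentile is an arbitrary choice).
threshold = np.quantile(score_hbos(model, sample), 0.99)

# 3) Apply the fitted histograms and the cutoff to the full dataset.
scores = score_hbos(model, full)
flags = scores > threshold
```

Only the histograms and the threshold need to be kept from the sample, so the full dataset can be scored in one cheap pass (or in chunks) without building histograms over all rows.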
  • M_Martin RapidMiner Certified Analyst, Member Posts: 125 Unicorn
    In addition to the suggestions from Telcontar120 above, have you allocated the maximum feasible amount of memory to RapidMiner Studio on your Windows Server machine? You mention that your machine has 32 GB of RAM in total, but how much of that have you allowed RapidMiner Studio to use?
    From the Settings --> Preferences menu, you can specify the maximum amount of RAM RM Studio can use. One of the machines I use for RapidMiner development has 32 GB of RAM, and I have allocated up to 23 GB of it to RapidMiner Studio because I have had processes fail due to running out of memory.
    RapidMiner Studio doesn't automatically grab all of the memory you allocate when it loads, but if you allocate (for example) 20 GB of RAM, RapidMiner Studio will use up to that amount if needed. The Resource Monitor panel always shows how much RAM RM Studio is using at any given time.
    I have also found it useful to close and restart RM Studio after running a memory-intensive process. This frees up considerable memory, since at startup RM Studio does not need all of the RAM you have allocated in Settings --> Preferences.
    Hope this has been helpful and best wishes, Michael Martin
  • Saurabh_Sawant_24 Member Posts: 7 Learner II
    @M_Martin
    We are running this process on RM Server, allocating 25 GB of RAM to one job agent container and 5 GB to Studio.
    After the process starts, we close Studio to free up memory.
  • Saurabh_Sawant_24 Member Posts: 7 Learner II
    @Telcontar120
    I am a newbie to both data science and RapidMiner.
    Can you tell me how to define or identify boundaries with HBOS?

    An example would really help me to understand. Thanks.
  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    @Saurabh_Sawant_24 HBOS doesn't require you to define the boundaries for outliers, just the number of bins used to generate the histograms. In my experience it is pretty robust and not overly sensitive to this parameter, so you can start with the default setting of -1 and see what kind of results that generates. [-1 is a special value that sets the number of bins to sqrt(N).]
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
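To give a feel for the sqrt(N) default mentioned above, here is a small arithmetic sketch. The helper name `default_bin_count` is hypothetical, and rounding down is an assumption; the thread does not say how the operator rounds.

```python
import math

def default_bin_count(n_rows):
    """Number of histogram bins under a sqrt(N) rule, rounded down (assumed)."""
    return max(1, math.isqrt(n_rows))

bins_full = default_bin_count(1_700_000)   # the full 1.7 million transactions
bins_sample = default_bin_count(100_000)   # a hypothetical 100,000-row sample
print(bins_full, bins_sample)              # prints: 1303 316
```

Under this rule the full dataset would get roughly 1,300 bins per attribute, while a 100,000-row sample would get roughly 300.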
  • kanika15 Member Posts: 1 Learner I
    Hi, in the above scenario, if the pipeline fails on AI Hub due to a memory issue, can that failure be captured via exception handling?