Running a process on cloud: regular/large/extra large

mskinner Member Posts: 10 Contributor I
edited November 2018 in Help

When I want to run a process on cloud, how do I know which size to choose?

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    It's really up to you. Choosing a larger size will simply allow your process to run more quickly if the dataset is large. The exception is when you get "out of memory" errors running it locally, in which case you may need a larger size to ensure it finishes at all.
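
    To judge whether memory is likely to be the deciding factor, a back-of-the-envelope estimate of the dataset's in-memory size can help. The sketch below is only a rough guide: the function name `estimate_dataset_gib` is made up for illustration, and the 8 bytes per value is an assumption (one double per cell). Real memory use depends on the column types and on how much the process copies or transforms the data, and the actual memory limit of each cloud size has to be looked up separately.

```python
def estimate_dataset_gib(rows: int, columns: int, bytes_per_value: int = 8) -> float:
    """Rough in-memory size of a purely numeric table, in GiB.

    Assumes every cell costs `bytes_per_value` bytes (8 = one double).
    This ignores nominal columns, metadata, and intermediate copies.
    """
    if rows < 0 or columns < 0 or bytes_per_value < 0:
        raise ValueError("rows, columns and bytes_per_value must be non-negative")
    return rows * columns * bytes_per_value / 2**30


if __name__ == "__main__":
    # Hypothetical dataset: 5 million rows, 50 numeric columns.
    size = estimate_dataset_gib(rows=5_000_000, columns=50)
    print(f"~{size:.2f} GiB before any processing overhead")
```

    If the estimate is already close to the memory of your local machine, that is a hint that the larger sizes are about finishing at all rather than about speed.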

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • hughesfleming68 Member Posts: 323 Unicorn

    I did some experimenting with this on AWS using an instance with 36 vCPUs. That configuration is basically a dual-CPU Intel server with nine physical cores (18 threads) per CPU and lots of memory.

     

    What stood out was that I could only ever get RapidMiner Studio to use one CPU (max 18 threads) in this case. It was not that fast either. I decided that the cloud was not for me after that experience.

     

    Regards,
    Alex
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Hi Alex, and thanks for sharing the results of your testing. I am not sure when you ran it, but the number of cores RapidMiner uses depends on the operators in the process. RapidMiner has recently made progress on parallel processing by enabling more of the most commonly used, processing-intensive operators to parallelize their work. See, for example, this recent announcement about the changes to the cross-validation operator, which was released earlier this month: https://rapidminer.com/new-parallel-cross-validation/

     

    So if your AWS testing was a while ago, you might want to redo it at some point to take advantage of the newer operators. I have tested the new cross-validation operator and it is definitely faster than the prior version.
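
    As a rough way to see why a bigger machine only pays off when the operators can parallelize, here is a small sketch of Amdahl's law. This is a general rule of thumb, not something taken from RapidMiner itself, and the parallel fractions used below are hypothetical values chosen for illustration, not measurements of any operator.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel.

    parallel_fraction: share of the runtime that can be parallelized (0.0 to 1.0).
    workers: number of threads or cores available to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical parallel fractions; 18 and 36 match the thread counts
    # mentioned for the AWS instance earlier in this thread.
    for fraction in (0.0, 0.5, 0.9):
        for workers in (18, 36):
            speedup = amdahl_speedup(fraction, workers)
            print(f"parallel={fraction:.0%} workers={workers}: {speedup:.2f}x")
```

    With a process that is mostly serial, going from 18 to 36 threads changes very little, which would be consistent with the cloud run not feeling fast.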

     

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts