RM Cloud feature requests

sgenzer · January 2015

1. Be able to see the log in real time.
2. If the log cannot be shown in real time, at least be able to see the log for stopped processes.
3. Have RM Cloud automatically stop a process if it is running into infinite loop errors.
4. Have more options for instance performance. My Mac Pro seems to beat all the others in at least certain processes, which I imagine is not the idea.

More soon as I continue to beta test...

Scott

Marco_Boeck · January 2015

Hi,

Thank you for your suggestions. We have thought about these things before, so let me answer them:

1) Definitely desirable. We decided against it for several reasons, including some performance concerns; however, we might look at this again in the future.
2) I agree.
3) When you build an infinite loop via "Execute Process" operators, we warn you before you submit your job to the Cloud. For example, if process 1 references process 2, which references process 3, which references process 1, you will get a warning. It is only a warning because you might have conditions in your processes which prevent it from being an actual loop. Other than that, I'm afraid that this is not going to happen. Being able to tell whether an arbitrary program will terminate at some point is simply impossible. If you found a solution to that problem, the Turing Award would be guaranteed.

4) See my answer in your other thread here: http://rapid-i.com/rapidforum/index.php/topic,8541.0.html
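
To illustrate the kind of check described in point 3, here is a minimal sketch of cycle detection over process references. It is not RM Cloud's actual implementation; the function name `find_reference_cycle` and the dictionary layout are assumptions made for this example. It only detects that a reference cycle exists, which is why the result can justify a warning but not a refusal to run.

```python
from typing import Dict, List, Optional

# Hypothetical helper, not part of RapidMiner: maps each process name to the
# processes it references (for example via "Execute Process" operators).
def find_reference_cycle(references: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of process references as a list of names, or None."""
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {name: UNVISITED for name in references}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        state[name] = ON_PATH
        path.append(name)
        for target in references.get(name, []):
            target_state = state.get(target, UNVISITED)
            if target_state == ON_PATH:
                # The target is already on the current path: a cycle closes here.
                return path[path.index(target):] + [target]
            if target_state == UNVISITED:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        state[name] = DONE
        return None

    for name in list(references):
        if state[name] == UNVISITED:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


# The example from the post: 1 -> 2 -> 3 -> 1.
example = {
    "process 1": ["process 2"],
    "process 2": ["process 3"],
    "process 3": ["process 1"],
}
print(find_reference_cycle(example))
# ['process 1', 'process 2', 'process 3', 'process 1']
```

A cycle found this way is structural only. Conditions inside the processes may still prevent the loop from ever being taken, so a warning is the most this check can support.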

Regards,
Marco

sgenzer · January 2015

Hi Marco -

Thanks for the quick feedback. As for infinite loop issues, there are tell-tale signs in the log file that RM may be able to detect and that could prompt RM to kill the instance. For example, there is usually a "WARNING" or "SEVERE" code in the log files when things are not going well. If the log files were fed into RM (maybe into an RM text mining process that reads the log files in real time?), it could scan for these flags and, if it sees them, kill the process or at least email the client that things are not going well. I know I can put similar tools in my processes, such as Handle Exception... maybe that is what we should do?
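
As a rough sketch of the log-scanning idea above, the following Python looks for "WARNING" and "SEVERE" flags and decides whether to raise an alert. The log line format, the function names, and the thresholds are all assumptions for illustration; none of this reflects how RM Cloud actually writes or processes its logs.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: the severity appears as a standalone upper-case word in each line.
FLAG_PATTERN = re.compile(r"\b(WARNING|SEVERE)\b")

def find_flagged_lines(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line number, level, text) for every line carrying a flag."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        match = FLAG_PATTERN.search(line)
        if match:
            flagged.append((number, match.group(1), line.rstrip("\n")))
    return flagged

def should_alert(flagged: List[Tuple[int, str, str]],
                 severe_limit: int = 1,
                 warning_limit: int = 10) -> bool:
    """Hypothetical rule: alert on any SEVERE line or on many WARNING lines."""
    severe = sum(1 for _, level, _ in flagged if level == "SEVERE")
    warnings = sum(1 for _, level, _ in flagged if level == "WARNING")
    return severe >= severe_limit or warnings >= warning_limit


# Made-up log lines, used only to exercise the functions.
sample_log = [
    "INFO: Process starts",
    "WARNING: Operator took longer than expected",
    "SEVERE: Process failed",
]
flagged = find_flagged_lines(sample_log)
if should_alert(flagged):
    for number, level, text in flagged:
        print(f"line {number}: {level}: {text}")
```

The sketch only reports what it finds. Whether a flagged log should lead to an email or to stopping the process is a separate decision.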

As for performance, I understand that the time indicated in the RM Cloud window does not show the actual process time. That's why I use the log files. However, when I ran a process yesterday four ways (on my own Mac Pro, on a Regular instance, on a Large instance, and on an X-Large instance), I got the following results directly from the log files:

Mac Pro (Late 2013 edition with 3.5GHz 6-Core Intel Xeon E5 and 16GB 1866 MHz DDR3 RAM) using RM Studio 6.2 Personal Edition: 9 minutes 59 seconds
RM Cloud Regular: 12 minutes 24 seconds
RM Cloud Large: 12 minutes 24 seconds
RM Cloud X-Large: STOPPED process after 22 minutes 37 seconds as I had already consumed 92 credits!

I can't analyze the log file from the X-Large run to see how far it got, because the log is not available for stopped processes (see my request 2 above).

The process was first a join of 5 repositories into an example set of 175684 examples, and then an x-validation of a Weka J-48 decision tree model. The join takes no time at all; it's the x-validation that takes the time. I know the Weka operators are not parallelized, and hence was not surprised that the Regular and the Large had the same runtime. But I was surprised that they were both slower than my Mac Pro (which is throttled by the 4GB memory limit of my license), and especially that the X-Large took so much longer on the exact same process. Also, can you tell us exactly what the specs are on the processors in the instances?

Let's continue on this thread only? Or email me directly if you like. Thanks as always. I do love RM and think RM Cloud has huge potential!!!

Scott

sgenzer · January 2015

Oh... another feature request would be the ability to set the max runtime for a process to less than 1 hour (especially for testing purposes like above). I would have set all three runs yesterday to a max of 30 minutes if I could have.

Marco_Boeck · January 2015

Hi,

1) Your approach to detecting problems is an interesting idea; however, it is not a general solution and would only work in certain cases. A heuristic like that might kill processes which are perfectly fine, and that is unacceptable. You can build a customized process-killer within your own process because you know your process and what could go wrong.

2) All Cloud instances use state-of-the-art Intel server CPUs; however, they are set up to be optimized for memory usage rather than computing power, as that is what is advertised at the moment. It is highly likely that we will expand the list of available Cloud instances in the future, though.

3) Max runtime less than an hour - noted.
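
The "customized process-killer" from point 1 and the shorter max runtime from point 3 can be combined in a self-imposed runtime budget. The sketch below shows the general pattern in plain Python and is not a RapidMiner feature; `run_with_budget`, `RuntimeBudgetExceeded`, and the idea of splitting the work into steps are assumptions made for this example.

```python
import time
from typing import Callable, Iterable, List

class RuntimeBudgetExceeded(Exception):
    """Raised when the self-imposed runtime budget has been used up."""

def run_with_budget(steps: Iterable[Callable[[], object]],
                    max_seconds: float,
                    clock: Callable[[], float] = time.monotonic) -> List[object]:
    """Run the steps in order, stopping once max_seconds have elapsed.

    The check is cooperative: it happens only between steps, so a single
    long-running step cannot be interrupted by this sketch.
    """
    deadline = clock() + max_seconds
    results = []
    for index, step in enumerate(steps):
        if clock() >= deadline:
            raise RuntimeBudgetExceeded(
                f"budget of {max_seconds} s used up before step {index}"
            )
        results.append(step())
    return results


# Usage: a 30-minute budget, as in the feature request above.
# The two lambdas are placeholders for real work.
results = run_with_budget([lambda: "join", lambda: "x-validation"],
                          max_seconds=30 * 60)
print(results)
```

Because the process author chooses where the checks go, this kind of guard avoids the false positives of a general heuristic, at the cost of only reacting at the points the author anticipated.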

Regards,
Marco

sgenzer · January 2015

That all sounds great. Thanks, Marco!!

Any idea why the X-Large process took so much longer than the Regular or Large in my test run?

Altair RapidMiner Community