Good morning everyone,
Anyone else noticing the H2O engine with RM 7.5 bogging down the CPU tremendously, even when not using Deep Learning? I have been experiencing these weird "cycles" where, all of a sudden, my CPU usage goes through the roof for a few minutes while it does some H2O stuff, and then cycles back down. Any way to tone this down?
Hi Scott, Brian,
Is it something that you only experience with 7.5?
I ask because the high CPU usage is more like a feature.
H2O algorithms are all highly parallel algos, not just Deep Learning. They spin up a cluster to leverage many cores.
The global "Number of threads" setting (Preferences...) in Studio or Server can limit their CPU usage.
No, I definitely like the usage of cores when I'm actually doing the modeling. The issue I'm having is that they seem to cycle up and down when I'm NOT running a process, or when my process does not have any modeling operators in it (ETL stuff). And when it does cycle up, it really goes whole-hog and practically locks up my machine until it's done.
I can do a video screencapture if you want so you can see the cycling.
Hmm, maybe this is not H2O related at all but might be a result of the new data core... Yes, any additional insights or a video would be highly appreciated.
OK, I "caught" RM today doing this cycling thing. No, it's not an H2O issue, or at least nothing is showing in the log. Here's a video screencapture with my CPU usage in the foreground. Note that I have no processes running at all. Nothing.
Can you do me a favor and look at your .RapidMiner folder and check the file size of "cta.h2.db"?
Once you have done that, and if the file is large, you can also try closing Studio and sending the file to me. Afterwards, delete it from the .RapidMiner folder and see if the problem is gone.
Hi Marco -
No problem - here you go:
And yes, I quit and restart RM all the time. I have a hypothesis on why this is happening. When I run a large process that has some kind of looping in it (e.g. loop operator, cross-validation, optimize parameters, etc.) and I decide to stop the process before it has finished, I have a hunch that the process does not actually stop, maybe due to its parallel processing somehow. When I click the "Stop" button, I still see the process icon spinning, and it will remain spinning until I delete that operator. It will even keep spinning when I start the process again. So I think what is happening in that activity monitor video is that, even though I have "stopped" a process, it's still going.
Case in point: yesterday I was running a process where RM was taking a large data set (2m+ examples, 50+ attributes) and creating k-means clusters of various sizes inside an Optimize Parameters operator. The goal was to optimize the performance of the clusters via cluster density. Knowing that this process was a monster that was going to take several hours, and could likely crash somehow, I had it store the performance of each cluster density using the Store operator inside. Well, lo and behold, I saw that this was stalling at some point, so I stopped it. But my CPU was still going strong and, sure enough, several minutes later I saw another performance pop into my repository. The only way I could really stop this whole thing was to quit and restart RapidMiner. It's sort of like that Monty Python movie: "STOP I say - or I will say STOP again!!"
Does this help?
Loops etc. should also stop when you click the stop button. The thing, however, is that this functionality depends on graceful termination by the operator implementation, i.e. the operator has to check, before doing meaningful work, whether the process was stopped by the user or not. Unfortunately, there are operators out there which do not regularly check whether they should stop, for various reasons (their code being old, using a 3rd party library we cannot control, etc.). If you have such processes and this occurs, feel free to share the process XML with me (private message if you like) and let me know where the process continued after you pressed the stop button.
You cannot safely terminate a thread in Java from the outside: forcibly killing it can interrupt a method call mid-execution and leave other things in undefined states. That is why the above will remain a problem.
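To illustrate what "checking whether the process was stopped" means, here is a minimal Java sketch of cooperative cancellation. It is only an illustration, not RapidMiner's actual operator code: the class `CooperativeStop`, the nested `LoopingTask`, and the methods `requestStop()` and `completedIterations()` are hypothetical names made up for this example. The point is that the loop itself has to look at a stop flag before each unit of work; code that never looks keeps running no matter how often stop is requested.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CooperativeStop {

    /** Hypothetical stand-in for an operator that loops over many iterations. */
    static class LoopingTask implements Runnable {
        private final AtomicBoolean stopRequested = new AtomicBoolean(false);
        private final AtomicInteger completed = new AtomicInteger(0);
        private final int iterations;

        LoopingTask(int iterations) {
            this.iterations = iterations;
        }

        /** Called from another thread, e.g. when the user clicks a stop button. */
        void requestStop() {
            stopRequested.set(true);
        }

        int completedIterations() {
            return completed.get();
        }

        @Override
        public void run() {
            for (int i = 0; i < iterations; i++) {
                // The check happens BEFORE each unit of work. Without it,
                // requestStop() would have no effect on this loop at all.
                if (stopRequested.get() || Thread.currentThread().isInterrupted()) {
                    return;
                }
                doOneUnitOfWork();
                completed.incrementAndGet();
            }
        }

        /** Placeholder for real work; sleeps briefly to simulate effort. */
        private void doOneUnitOfWork() {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so the loop check can see it.
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LoopingTask task = new LoopingTask(1_000_000);
        Thread worker = new Thread(task);
        worker.start();

        Thread.sleep(50);      // let it run for a moment
        task.requestStop();    // ask politely; the task decides when to exit
        worker.join();         // returns soon after the next flag check

        System.out.println("Stopped after " + task.completedIterations() + " iterations");
    }
}
```

If `doOneUnitOfWork()` were a long call into a third-party library that never returns control, the flag would not be checked until that call finished, which matches the behavior described above.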
If you can, please also upload the cta.h2.db file for me and send me a download link (again via private message), that would be tremendously helpful.