RapidMiner process's CPU consumption after being stopped is equal to when running?

ruser Member Posts: 40 Guru
edited November 2018 in Help
I have the following operators in my process:
- DatabaseExampleSource
- Nominal2Binominal

I found that Nominal2Binominal was taking too much time (more than 1.5 hours) because of the large table. Hence, I abruptly stopped the process. But what I observed was that CPU utilization stays high (50% by javaw.exe, i.e. the RapidMiner process) even after the process was stopped. This is the same CPU consumption that I observed while the process was running.

I'm totally puzzled. What is RapidMiner doing after the process is stopped? I could reproduce it multiple times. It returns to normal only if I close and reopen RapidMiner. Closing and opening another process file did not help; I had to close and restart RapidMiner completely.

Does anybody have an answer? I have RapidMiner v4.4 installed on my Windows XP PC.

Answers

  • fischer Member Posts: 439 Maven
    Hi,

    if you look at the log files, you will see that RapidMiner reports that, when a process is stopped, it will always complete the current operator in the background. This is because the only way to stop a thread immediately in Java is Thread.stop(), which has been deprecated for as long as I have known Java :-) (for good reasons). Some operators check in their innermost loop whether the process was stopped, so they can react more quickly, but for technical reasons this is not easily possible for the Nominal2Binominal operator.
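
    To illustrate the stop check described above, here is a minimal sketch of cooperative cancellation in plain Java. The class and method names (CooperativeStopSketch, LongTask, requestStop) are hypothetical and are not RapidMiner's operator API; the sketch only assumes a worker that can afford to test a flag once per loop iteration.

    ```java
    import java.util.concurrent.atomic.AtomicBoolean;

    public class CooperativeStopSketch {

        /** Hypothetical long-running task; a stand-in, not a RapidMiner operator. */
        static class LongTask implements Runnable {
            private final AtomicBoolean stopRequested = new AtomicBoolean(false);
            private final long totalRows;
            private volatile long rowsProcessed = 0;

            LongTask(long totalRows) {
                this.totalRows = totalRows;
            }

            /** Called from another thread (e.g. a GUI "stop" button). */
            void requestStop() {
                stopRequested.set(true);
            }

            long rowsProcessed() {
                return rowsProcessed;
            }

            @Override
            public void run() {
                for (long row = 0; row < totalRows; row++) {
                    // The check sits in the innermost loop, so the task
                    // reacts within one iteration of a stop request.
                    if (stopRequested.get()) {
                        return;
                    }
                    rowsProcessed = row + 1; // placeholder for the real per-row work
                }
            }
        }

        public static void main(String[] args) throws InterruptedException {
            LongTask task = new LongTask(Long.MAX_VALUE);
            Thread worker = new Thread(task, "worker");
            worker.start();

            Thread.sleep(100);   // let the task run for a moment
            task.requestStop();  // ask it to stop instead of calling Thread.stop()
            worker.join(2000);   // wait for the loop to notice the flag

            System.out.println("worker alive after stop: " + worker.isAlive());
        }
    }
    ```

    A task without such a check keeps running, and keeps using CPU, until its loop finishes on its own, which matches the behavior reported in the question.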

    Cheers,
    Simon
  • Stefan_E Member Posts: 53 Guru
    Hmm... understood. I was stumbling over the same behavior with SVMs.

    That's actually quite nasty, because you want to do a grid search for C and gamma, but for some values the SVM process doesn't seem to converge. So the only way out is to kill RM.
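
    One way to keep a single non-converging parameter combination from blocking a whole grid search is to give each run a time budget. The sketch below uses only java.util.concurrent; the train method is a hypothetical stand-in that deliberately never converges for C >= 100, and it is not an SVM or any RapidMiner code. Note that Future.cancel(true) merely interrupts the worker thread, so this only helps when the training loop itself checks for interruption.

    ```java
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    public class GridSearchTimeoutSketch {

        /** Hypothetical training routine: loops forever when c >= 100. */
        static double train(double c, double gamma) throws InterruptedException {
            long iterations = 0;
            while (c >= 100.0 || iterations < 1000) {
                // Without this check, cancel(true) would have no effect.
                if (Thread.interrupted()) {
                    throw new InterruptedException("training cancelled");
                }
                iterations++;
            }
            return 1.0 / (1.0 + c * gamma); // made-up score for illustration
        }

        /** Returns the score, or NaN if the run exceeded its time budget. */
        static double trainWithBudget(ExecutorService pool, double c, double gamma, long millis)
                throws InterruptedException, ExecutionException {
            Future<Double> future = pool.submit(() -> train(c, gamma));
            try {
                return future.get(millis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the worker and move on
                return Double.NaN;
            }
        }

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            try {
                for (double c : new double[] {1.0, 100.0}) {
                    for (double gamma : new double[] {0.1, 1.0}) {
                        double score = trainWithBudget(pool, c, gamma, 200);
                        System.out.println("C=" + c + " gamma=" + gamma
                                + (Double.isNaN(score) ? " timed out" : " score=" + score));
                    }
                }
            } finally {
                pool.shutdownNow();
            }
        }
    }
    ```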

    Similar with some plotters: selecting some of them, either accidentally or out of negligence, takes forever if you have many data points (just hit 'parallels' on some 30k examples - not a good idea!)

    Stefan
  • dragoljub Member Posts: 241 Maven
    I agree,

    There should be some way to kill the currently running process, even if it's just a plotting operator. For example, a scatter matrix on a moderate number of samples will usually crash the program, apparently for memory reasons.

    -Gagi