job container forcibly killed

rur68 RapidMiner Certified Analyst, Member Posts: 8 Contributor II
I run a modeling job on RapidMiner Server. After all the operators finish successfully and the result is stored correctly, the process ends in an error state:
"Job container '1' was killed forcefully and therefore the job execution has been stopped. Reason: Restart of job container has been invoked".
I didn't change the container restart policies, so they are still at the defaults.
Has anyone else encountered anything similar? Any suggestions on how to diagnose the issue?

Answers

  • jpuente Employee, Member Posts: 41  RM Product Management
    Hi. Job containers are configured by default never to restart. If they do, it's typically because of some problem with the job. Is there any other problem in the log? Does this always happen with that job? 
  • rur68 RapidMiner Certified Analyst, Member Posts: 8 Contributor II
    Hi @jpuente, thank you for your response. I have actually already solved this.
    I got many "matrix is singular" warnings in the log, probably because of the data I'm trying to predict on. I removed the cause of the problem and then it ran well.
    But this error keeps appearing since I upgraded to RM Server 9.6. Some of my jobs that were fine in the previous version now end in an error state like this. I don't know what's going on. Is version 9.6 more sensitive to warnings like this?

  • jpuente Employee, Member Posts: 41  RM Product Management
    Hi. There was no change that should have altered the behaviour that way. We could try to dig a bit deeper if you send the agent config file and the full log.
  • rur68 RapidMiner Certified Analyst, Member Posts: 8 Contributor II
    Hi @jpuente, here are the agent config file and the log.
    FYI, the previous version I used was 9.0. Also, this is not the only job that causes the job container to be killed. I have another job that always ends in an error state like this, even though the result is stored successfully.
  • jpuente Employee, Member Posts: 41  RM Product Management
    It looks like the JC becomes unresponsive right after completing the job. I'll share internally and see what we can find.
  • rur68 RapidMiner Certified Analyst, Member Posts: 8 Contributor II
    Hi, thank you for your answer.
    1. Unfortunately it's not possible to try 9.7.1 Server/AI Hub right now. But what is the fundamental change in architecture in these versions?
    2. I don't think we have a network problem, because other jobs run well.
    3. It's also not the memory; I already increased it.
    4. This is the only option I could try. I already did, and it works. But I'm still confused about why the job ran well in version 9.0 while 9.6 produces errors like this.
    Anyway, thank you very much @aschaferdiek

  • aschaferdiek Employee, Member Posts: 47   RM Engineering
    There's no fundamental change in architecture from 9.6 to 9.7.1, but it's always a good idea to have the latest version running. :) From 9.0 to 9.6 there is a fundamental change: Job Agent and Job Containers now communicate internally via HTTP/REST. In 9.0 there was no such communication, so this error couldn't pop up because there was no link between them at all (the Job Container was just started as a separate and entirely standalone OS process).
    Glad that changing the properties helped. Since it did, it's still very likely that this is some weird networking/machine problem. The timeout message still suggests that. I know we cannot be sure, but there's no other reason why a simple HTTP request would time out on localhost.
    Thank you for taking the time to try this out together with me. We'll consider increasing the defaults here.
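
For readers who want to see the effect described above, here is a minimal Python sketch of how a status request over HTTP on localhost can time out when the process on the other end is slow to answer, and why a larger timeout makes the same request succeed. Everything in it is an assumption made for illustration: the `/status` path, the `FINISHED` response, the delay, and the timeout values are hypothetical and are not RapidMiner's actual endpoints, messages, or property names.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

RESPONSE_DELAY_S = 0.5  # hypothetical: how long the stand-in "container" takes to answer


class SlowStatusHandler(BaseHTTPRequestHandler):
    """Stand-in for a process that is slow to answer a status request."""

    def do_GET(self):
        time.sleep(RESPONSE_DELAY_S)  # simulate a busy or stalled process
        body = b"FINISHED"  # hypothetical status text
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client already gave up and closed the connection

    def log_message(self, *args):
        pass  # keep the demo output quiet


def poll_status(url, timeout_s):
    """Return the status text, or None if the request failed or timed out."""
    # Bypass any proxy configured in the environment; this is a localhost call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_s) as resp:
            return resp.read().decode()
    except OSError:  # covers socket timeouts and urllib.error.URLError
        return None


def run_demo():
    """Poll the slow server once with a short and once with a generous timeout."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowStatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/status"
    try:
        impatient = poll_status(url, timeout_s=0.1)  # shorter than the delay
        patient = poll_status(url, timeout_s=5.0)    # longer than the delay
    finally:
        server.shutdown()
        server.server_close()
    return impatient, patient


if __name__ == "__main__":
    impatient, patient = run_demo()
    print("short timeout:", impatient)
    print("long timeout:", patient)
```

The sketch only shows the general mechanism: a caller that treats a timed-out request as "unresponsive" will give up on a process that is merely slow, and raising the timeout hides the symptom without explaining why the response was slow in the first place.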