Scheduled tasks not executing - Error message is "Job Service seems to not be Available"

adamfadamf Member Posts: 34 Contributor I
edited December 2018 in Help

I am getting an error when trying to run scheduled tasks on the free edition of RM Server.  The pertinent error message seems to be: “The job service seems to not be available; please check your setup”.

 

Do I need to execute/start any other programs or Windows services in addition to the RM Server Windows service?  I tried starting the job service from a CMD.EXE window by running the batch file "rapidminer-jobagent", which seems to have completed successfully.

 

 


Answers

  • adamfadamf Member Posts: 34 Contributor I

    I wonder if this screen-shot may point to part of the problem.  I tried to schedule a new task on the server from RM Studio, but it tells me that no Queues are available.  When I check Queues on the server from the Processes menu, I get the message "Unable to Load Queues".

  • sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @adamf - welcome to the community. Are you sure you're not running more than one process at the same time? The free license of RM Server is restricted as you can see from the error message.

     

    I'd also advise going through the docs for RM Server: https://docs.rapidminer.com/latest/server/how-to/

     

    Scott

     

     

  • adamfadamf Member Posts: 34 Contributor I

    Hi Scott.  Yes, I am sure: I do not have more than one process scheduled at the same time.  The scheduling is staggered so that the processes don't overlap.  Moreover, when I try to schedule a process from RM Studio, the dialog tells me: "No Queues Available" (see attached screenshot).  Perhaps something is wrong with the configuration, but I have installed and reinstalled RM Server twice successfully.  Do I need to create/manage queues outside of the RM Server installation process?

  • JessForbesRMJessForbesRM RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 38 RM Data Scientist

    Hello,

     

    It seems like your job agent either isn't installed or isn't running.  

     

    You can check the log specific to the job-agent by looking in your rapidminer folder under \job-agent\logs.  Do you see any errors explaining why the job agent hasn't started?
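
As a rough aid for the log check suggested above, the following Python sketch lists every ERROR line found in a log directory. The function name `error_lines`, the `*.log` file pattern, and the example path in the usage note are assumptions for illustration; they are not part of RM Server.

```python
import re
from pathlib import Path

# Matches the log level token "ERROR" as a whole word.
ERROR_PATTERN = re.compile(r"\bERROR\b")


def error_lines(log_dir):
    """Return (file name, line) pairs for every ERROR line in *.log files.

    Assumption: the job-agent logs are plain-text files ending in .log.
    """
    hits = []
    for log_file in sorted(Path(log_dir).glob("*.log")):
        with open(log_file, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if ERROR_PATTERN.search(line):
                    hits.append((log_file.name, line.rstrip("\n")))
    return hits
```

It could be called as `error_lines(r"C:\rapidminer-server\job-agent\logs")`, where the path is hypothetical and should be replaced with the actual installation folder.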

  • adamfadamf Member Posts: 34 Contributor I

    Hi Jess,

    Indeed, the job-agent log does report the following error, but I am unclear how to correct it: 

     

    ERROR 1936 --- [defaultTaskExecutor-3] o.s.j.l.DefaultMessageListenerContainer  : Could not refresh JMS Connection for destination '__agentCommand' - retrying using FixedBackOff{interval=5000, currentAttempts=42931, maxAttempts=unlimited}. Cause: Could not connect to broker URL: tcp://For-Win7-pro-PC:5672. Reason: java.net.ConnectException: Connection refused: connect

     

  • JessForbesRMJessForbesRM RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 38 RM Data Scientist

    Hey Adam,

     

    Looks like Java isn't allowing the connection.  I saw something similar when I didn't notice a security pop-up asking for my permission the first time I installed and launched.  To correct it I tried the following.

     

     

    1 - Clear the Java cache and installed applications - https://www.java.com/en/download/help/plugin_cache.xml

     

    2 - Ensure Java is allowed through the firewall, inbound and outbound - see the attached image from the firewall control panel on Windows 10

    JavaFirewallRules.JPG

     

    3 - Start the job agent manually from job-agent\bin\rapidminer-jobagent to ensure it starts and can be seen inside the server GUI, then restart the server service to ensure the job agent starts up in tandem with the server as designed

  • adamfadamf Member Posts: 34 Contributor I

    It appears that the Firewall rules do allow Java as per your screen shot.  I cleared the Java cache.  Same result.  I noticed in the netstat command that port 5672 is not showing up.  Is the server supposed to be listening on that port?  The RM Server service is running when I try to start the job-agent (via rapidminer-jobagent.bat), but the latter continues to report the same error.
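
The netstat observation above can be cross-checked from the client side. This is a minimal Python sketch, assuming only the standard library, that reports whether anything accepts TCP connections on a given host and port. The function name `is_port_open` is made up for this example; port 5672 is the one named in the job-agent error.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open("localhost", 5672)` returning False would agree with netstat showing no listener, which points at the broker side (the server) rather than at the job agent.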

  • JessForbesRMJessForbesRM RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 38 RM Data Scientist

    Hey Adam,

     

    I've been kicking this around for a bit and I think that error may actually be a red herring.  On my functional server, the log shows that same error repeatedly before it shows the connection succeeding.

     

    2018-01-23 09:54:25.693 ERROR 11736 --- [taskScheduler-3] c.r.e.jobagent.scheduled.JobAgentTasks : Could not send Job Agent status message: Exception while sending job agent status message
    2018-01-23 09:54:26.898 ERROR 11736 --- [tTaskExecutor-3] o.s.j.l.DefaultMessageListenerContainer : Could not refresh JMS Connection for destination '__agentCommand' - retrying using FixedBackOff{interval=5000, currentAttempts=7, maxAttempts=unlimited}. Cause: Could not connect to broker URL: tcp://RMUS-JFORBES:5672. Reason: java.net.ConnectException: Connection refused: connect
    2018-01-23 09:54:29.635 ERROR 11736 --- [tTaskExecutor-1] o.s.j.l.DefaultMessageListenerContainer : Could not refresh JMS Connection for destination '__agentCommand' - retrying using FixedBackOff{interval=5000, currentAttempts=8, maxAttempts=unlimited}. Cause: Could not connect to broker URL: tcp://RMUS-JFORBES:5672. Reason: java.net.ConnectException: Connection refused: connect
    2018-01-23 09:54:30.700 ERROR 11736 --- [tTaskExecutor-2] o.s.j.l.DefaultMessageListenerContainer : Could not refresh JMS Connection for destination '__agentCommand' - retrying using FixedBackOff{interval=5000, currentAttempts=8, maxAttempts=unlimited}. Cause: Could not connect to broker URL: tcp://RMUS-JFORBES:5672. Reason: java.net.ConnectException: Connection refused: connect
    2018-01-23 09:54:32.035 INFO 11736 --- [tTaskExecutor-3] o.s.j.c.SingleConnectionFactory : Established shared JMS Connection: ActiveMQConnection {id=ID:RMUS-JFORBES-58898-1516719214197-1:36,clientId=812e5b4d-1238-41b1-b9ac-7074e817749a,started=false}
    2018-01-23 09:54:32.048 INFO 11736 --- [tTaskExecutor-3] o.s.j.l.DefaultMessageListenerContainer : Successfully refreshed JMS Connection
    2018-01-23 09:54:34.636 INFO 11736 --- [tTaskExecutor-1] o.s.j.l.DefaultMessageListenerContainer : Successfully refreshed JMS Connection
    2018-01-23 09:54:35.701 INFO 11736 --- [tTaskExecutor-2] o.s.j.l.DefaultMessageListenerContainer : Successfully refreshed JMS Connection
    2018-01-23 09:54:35.973 INFO 11736 --- [tTaskExecutor-2] c.r.e.j.queue.CommandMessageConsumer : Activation from Job Service arrived. Starting job message listener.
    2018-01-23 09:54:35.974 INFO 11736 --- [tTaskExecutor-2] c.r.e.j.queue.CommandMessageConsumer : Job message listener activation finished.
    2018-01-23 09:54:37.052 INFO 11736 --- [tTaskExecutor-3] o.s.j.l.DefaultMessageListenerContainer : JMS message listener invoker needs to establish shared Connection
    2018-01-23 09:54:37.052 INFO 11736 --- [tTaskExecutor-3] o.s.j.l.DefaultMessageListenerContainer : Successfully refreshed JMS Connection

     

    I would recommend bringing your server down and saving off the server logs - both the standalone log and the job-agent log.  Ensure there are no currently running Java processes by killing off any listed in your process list.  Once you've made sure there are no running/wedged Java processes, restart the server.  Your job agent should be started as well; give it about 3-5 minutes and take a look at the job-agent log if it is still not working.

     

    Unfortunately you may need a re-install in a fresh file location on your server if this doesn't fix it. 
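
To tell a harmless startup retry from a broker that never comes up, one option is to look at which connection message appears last in the job-agent log. The Python sketch below does that using the message texts from the log excerpt above. The function name `broker_status` and its three return values are assumptions for illustration.

```python
def broker_status(log_lines):
    """Classify a job-agent log as 'connected', 'refused', or 'unknown'.

    The most recent connection-related message decides the result, so
    early "connection refused" retries followed by a successful refresh
    count as 'connected'.
    """
    status = "unknown"
    for line in log_lines:
        if "Could not connect to broker URL" in line:
            status = "refused"
        elif ("Successfully refreshed JMS Connection" in line
              or "Established shared JMS Connection" in line):
            status = "connected"
    return status
```

A log that still ends in 'refused' several minutes after the server has started suggests the broker never came up, as opposed to the brief retry sequence seen on a working server.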

  • adamfadamf Member Posts: 34 Contributor I

    Thanks Jess.

    One thing I am still not clear about is whether the job-agent is supposed to start automatically (when the RM Server Windows service starts) or manually, and at what point it is supposed to start.  Previous posts suggested that the job-agent was "unavailable".  So, I've been trying to start it manually by running the corresponding BAT file (with Admin privilege).  It is during the BAT file execution in the CMD console window that I see the connection errors.  Note also that according to the NetStat command, no process appears to be listening on port 5672 (the port that the job-agent error messages indicate it's trying to connect to).  Is it accurate to assume that the Server process, which starts automatically at system startup, should be listening on port 5672?  Is the Server process supposed to also start the job-agent and create the Default queue?  Seems odd to me that the job-agent has to be started manually, particularly since the server cannot schedule/run jobs without it.

  • JessForbesRMJessForbesRM RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 38 RM Data Scientist

    Hey Adam,

     

    For me the job agent starts automatically, but I don't have the server installed as a service.  For troubleshooting purposes, what happens when you launch the server as a bat file instead of starting the service?

     

    The standalone.bat file is what I use to launch my own testing environment.  When I launch it, the job-agent starts in a separate window without any other interaction required from me.

  • adamfadamf Member Posts: 34 Contributor I

    Hi Jess,

    I started the server using the BAT file.  Same result, although I did notice that the server BAT file did launch the job-agent BAT file.  So, at least if it were successful, it would have started both the server and the job-agent.  

    It appears that the Server BAT file reported several errors.  The last message in the console window is:

    "JBAS014777: Services which failed to start: service jboss.web.deployment.ra-host./executions"

    Meanwhile, the job-agent console window has been complaining for an hour that it:

    "Could not connect to Broker URL" on port 5672 with reason "java.net.ConnectException: Connection refused: connect"

    I checked NetStat again, and no process is listening on port 5672.  Note also that I am using the free version of the RM Server.  Is that what you're using?

  • JessForbesRMJessForbesRM RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 38 RM Data Scientist

    Hey Adam,

     

    It sounds like you are due a clean install. Shut your server down, shut down any java and jboss processes - restart just to make sure you do not have any existing wedged instances already running.  

     

    I searched for your JBAS error here on the community and found it is not that uncommon to see it if your install is damaged.  The error states that your JBoss is not able to start, which explains why the process isn't able to listen: it cannot start.  Here is a thread showing a potential solution - https://community.rapidminer.com/t5/RapidMiner-Server-Forum/RapidMiner-Server-not-starting-up-any-more/td-p/32090

     

    You can attempt the ear removal as instructed in the other thread, or you can bypass that and simply create a new install location and a fresh install.  I recommend a fresh install as it's the most reliable.

     

     

  • adamfadamf Member Posts: 34 Contributor I

    I will reinstall - this will be at least the third time.  I follow the installation instructions and the installation process seems to complete successfully.  I have been using the same backend database that was used for the previous version of the server.  I wonder if that could be causing the problem.  I assume that the database has all of my scheduled tasks.  So, I'd have to recreate all of them if I wipe the database before doing the reinstall of the server, although it is not clear to me that the database would have anything to do with java processes not starting.

    I am using the rapidminer-server-installer-8.0.1, which I assume is the latest server installation package.

  • adamfadamf Member Posts: 34 Contributor I

    Hi Jess,

    The "Clean Install" produced the same results with the same errors.  However, I next tried a clean installation of RM Server AND a new "clean" backend schema for the server's database.  Curiously, using the database from the previous version seems to have been a/the culprit!  I cannot explain exactly why, but once I pointed RM Server to a new, "clean" database during the installation process, I was able to schedule processes once again!

    I find it difficult to imagine that NO ONE who has upgraded to the new RM Server has run into the same issue (i.e., using a database from a previous version of RM Server).  I wonder if I missed a step in the installation instructions that said I must use a new, empty schema during the upgrade process?  If not, I would think it should be noted in the installation instructions; or, better yet, it would seem that a database from a previous version of RM Server SHOULD be allowed without causing conflicts and errors?

    Of course, by starting with a new empty database, I had to manually recreate all of my scheduled jobs.  The scheduler ran them this morning, which is major progress.  However, they all failed because I forgot that I must copy the RM Extensions in TWO places for the server: 1) the server's extensions directory, and 2) the job agent's extensions directory.  (I suppose I could have pointed the job agent's extensions path to the same location as the server's).

    With the extensions copied, I am hopeful that the scheduled tasks will execute without error tomorrow morning.
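
The two-location copy described above is easy to forget, so it may be worth scripting. This Python sketch copies extension files from one source folder into several target folders. It assumes the extensions are `.jar` files; the function name `copy_extensions` and all paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def copy_extensions(source_dir, target_dirs):
    """Copy every .jar in source_dir into each directory in target_dirs.

    Assumption: extensions are distributed as .jar files. Target
    directories are created if they do not exist yet.
    Returns the list of destination paths that were written.
    """
    copied = []
    for jar in sorted(Path(source_dir).glob("*.jar")):
        for target in target_dirs:
            target_path = Path(target)
            target_path.mkdir(parents=True, exist_ok=True)
            destination = target_path / jar.name
            shutil.copy2(jar, destination)
            copied.append(str(destination))
    return copied
```

A call might look like `copy_extensions(r"C:\my-extensions", [server_extensions_dir, job_agent_extensions_dir])`, where both target variables stand for the two extension directories of the actual installation.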

     

  • JessForbesRMJessForbesRM RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 38 RM Data Scientist

    Hello Adam,

     

    Unfortunately, issues with the DB can occur during an upgrade.  They are uncommon, but that is why we include backing up your database as a step in our documentation for performing an upgrade to 8.x

     

    https://docs.rapidminer.com/latest/server/administration/upgrading/updating-rms-to-8_0.html#full-upgrade

     

    You should be able to recover any tables you need from your backup that was taken during the first upgrade.  
