Why a major version?
RapidMiner Server 8.0 comes with a brand-new architecture, delivering some exciting changes: full scalability, more useful queues, improved resource management, a nicer UI and a lot of changes that will open many new use cases. This represents a big leap in what RapidMiner Server can do for you.
So, what does the new architecture look like?
Each blue box represents a separate machine. The big box to the left represents the central RapidMiner Server, which provides the web UI and receives all the user requests. All this is done by the RapidMiner Server central node:
The Job Agents (in dark blue) are the new kids on the block. They can be deployed on remote machines or locally on the central node, and each one has to be configured to point to a single queue. A Job Agent checks its queue, picks up any pending job, spawns a Job Container and executes the job (the user process designed in Studio). This means increased scalability, plus resource sharing and management among users or projects.
Although scalability is the main new feature, it's still possible to run RapidMiner Server on a single machine with one (or more) local Job Agents executing the jobs. Even if you run everything on a single machine, the new architecture will provide better fault tolerance and improved reliability.
More about each component. What should I install?
There are two components:
You will be able to download it from our website. The installation process is the same as in older Server versions. During the installation, you can choose whether to have a local Job Agent and, if so, which resources (memory and CPUs) to dedicate to it.
If you already have a Server running, you can migrate it to the new version. There will be two migration options:
Potentially, you could have several local queues and Job Agents, but each will take up its share of memory. That kind of configuration could be good if you have a big machine that you want to split in a logical way to share its resources among user groups or applications.
You can download the Job Agent from our downloads page. It is a zip package that you decompress wherever you want it to run. You then need to edit its configuration file to point it to the right Server and queue. Alternatively, every time you create a new queue in the Server’s UI, a link appears for downloading the matching configuration, which you can copy directly into the Job Agent’s folder.
How does it work?
When a user schedules a process from Studio or from the Server's UI, the process is placed into the corresponding queue. Any of the Job Agents connected to that queue can pick up the work and run the process. The RapidMiner Server (and the user, through the UI or Studio) gets notified and logs become available.
The process is fully executed in the Job Agent. It connects to the repository, external data sources or whatever is needed for the process independently. There is no data flow from the Server to the Job Agents.
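The flow described above can be sketched as a small simulation. This is only an illustration, not RapidMiner code: the class names (`Server`, `JobAgent`, `Job`), the queue names and the job states are all hypothetical, and the "container" is reduced to a method call.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    process_name: str                       # the process designed in Studio
    state: str = "PENDING"
    log: list = field(default_factory=list)


class Server:
    """Hypothetical stand-in for the central node: holds queues, receives notifications."""

    def __init__(self, queue_names):
        self.queues = {name: deque() for name in queue_names}
        self.notifications = []

    def schedule(self, process_name, queue_name):
        # Scheduling only places the job into the chosen queue; nothing runs here.
        job = Job(process_name)
        self.queues[queue_name].append(job)
        return job

    def notify(self, job):
        self.notifications.append((job.process_name, job.state))


class JobAgent:
    """Hypothetical agent that is configured to point to exactly one queue."""

    def __init__(self, name, server, queue_name):
        self.name = name
        self.server = server
        self.queue_name = queue_name

    def poll_once(self):
        queue = self.server.queues[self.queue_name]
        if not queue:
            return None                     # nothing pending in this agent's queue
        job = queue.popleft()
        self._run_in_container(job)
        self.server.notify(job)             # the Server learns about the result
        return job

    def _run_in_container(self, job):
        # A real agent would spawn a Job Container; the process itself would
        # connect to the repository and data sources on its own.
        job.state = "RUNNING"
        job.log.append(f"{self.name}: container started for {job.process_name}")
        job.state = "FINISHED"
        job.log.append(f"{self.name}: {job.process_name} finished")


server = Server(["DEFAULT", "reporting"])
agent = JobAgent("agent-1", server, "reporting")

job = server.schedule("monthly_report", "reporting")
other = server.schedule("churn_model", "DEFAULT")   # no agent listens on DEFAULT here
agent.poll_once()
```

In this sketch `job` ends up finished with two log lines, while `other` stays pending because no agent points to its queue.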
Queues and Scheduling
Unlike in previous versions, queues are now linked to Job Agents. Queues have user permissions, and sending a process/job to a queue determines which Job Agents will work on it and how many resources will be available. Many processes can run in parallel if there are enough free resources, but a single process is always run by a single Job Agent.
If no free resources are available when a process is scheduled, it waits in the queue until it's picked up by an available Job Agent.
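The waiting behavior can be illustrated with a toy model. It is a sketch under simplifying assumptions: resources are reduced to a hypothetical number of "slots" per agent (the real setting is memory and CPUs), and a `dispatch` helper hands out jobs, whereas real Job Agents pull from their queue themselves.

```python
from collections import deque


class Agent:
    """Hypothetical Job Agent with a fixed number of parallel slots."""

    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots
        self.running = []

    def start(self, job):
        self.free_slots -= 1
        self.running.append(job)

    def finish(self, job):
        self.running.remove(job)
        self.free_slots += 1


def dispatch(queue, agents):
    """Give pending jobs to agents with free slots; the rest keep waiting."""
    assignments = {}
    while queue:
        agent = next((a for a in agents if a.free_slots > 0), None)
        if agent is None:
            break                      # no free resources: jobs stay in the queue
        job = queue.popleft()
        agent.start(job)               # one job is always run by one agent
        assignments[job] = agent.name
    return assignments


queue = deque(["job-a", "job-b", "job-c", "job-d"])
agents = [Agent("agent-1", slots=2), Agent("agent-2", slots=1)]

first = dispatch(queue, agents)        # three jobs start in parallel
waiting_after_first = list(queue)      # the fourth one waits

agents[1].finish("job-c")              # a slot frees up on agent-2
second = dispatch(queue, agents)       # the waiting job is picked up
```

With three slots in total, three of the four jobs start at once and `job-d` waits until `agent-2` finishes `job-c`.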
What doesn't change
Only processes launched or scheduled from Studio or from the Server's GUI are executed in the Job Agents. Jobs requested through Web Services, Web apps or triggers are not affected by the architecture change and they will continue to run in the central RapidMiner Server.
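As a quick summary of that rule, here is a minimal lookup sketch. The source labels and return values are made-up identifiers for illustration only; they are not names used by RapidMiner Server.

```python
# Hypothetical labels for where an execution request comes from.
AGENT_SOURCES = {"studio", "server_gui"}
CENTRAL_SOURCES = {"web_service", "web_app", "trigger"}


def execution_target(source):
    """Return where a request runs in 8.0, based on how it was launched."""
    if source in AGENT_SOURCES:
        return "job_agent"
    if source in CENTRAL_SOURCES:
        return "central_server"
    raise ValueError(f"unknown request source: {source!r}")


targets = {s: execution_target(s) for s in sorted(AGENT_SOURCES | CENTRAL_SOURCES)}
```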
In a nutshell, these are the most noticeable differences from the old version:
RapidMiner Server 8.0 is a big step for scalability and management, but we are just getting started! Take a look at this other post in our company blog. There are more architectural issues that we want to address, like moving the web-service executions to the Job Agents, improving latency and performance, moving toward a fully highly available environment, and much more. Stay tuned!