You might have read our previous blog posts about the new architecture in the upcoming RapidMiner Server 8.0 release (here for the general motivation, on our company blog, and here for a more technical description). One might think this is just an internal technical improvement, but it is more than that: the change has deep consequences for the many ways you can use RapidMiner Server. That’s why we’ve moved the needle, said goodbye to the 7, and are happy to welcome 8.0!
The most obvious way in which RapidMiner environments will change with this new release is the option to scale out. If your computational needs exceed what a single machine can offer, you can now deploy multiple RapidMiner Job Agents across multiple machines or VMs and leverage all your resources. Your environment can now scale both vertically and horizontally.
Job Agents are connected to queues and are constantly polling and asking for something to do. This way, RapidMiner can work in a grid-like fashion, sending jobs to free resources that can work on them.
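To make the polling idea concrete, here is a minimal Python sketch. It is not RapidMiner code: the names `job_agent` and `run_grid` are made up for illustration, threads stand in for Job Agents on separate machines, and an in-memory queue stands in for the real job queue.

```python
import queue
import threading


def job_agent(name, job_queue, results, stop_event, poll_interval=0.05):
    """Hypothetical agent loop: keep asking the queue for something to do."""
    while not stop_event.is_set():
        try:
            job_id, work = job_queue.get(timeout=poll_interval)
        except queue.Empty:
            continue  # nothing to do right now, poll again
        results.append((job_id, name, work()))
        job_queue.task_done()


def run_grid(jobs, agent_names):
    """Start one thread per agent, submit the jobs, and wait until all are done."""
    job_queue = queue.Queue()
    results = []
    stop_event = threading.Event()
    agents = [
        threading.Thread(target=job_agent, args=(name, job_queue, results, stop_event))
        for name in agent_names
    ]
    for agent in agents:
        agent.start()
    for job in jobs:
        job_queue.put(job)
    job_queue.join()   # blocks until every job has been marked as done
    stop_event.set()   # tell the agents to stop polling
    for agent in agents:
        agent.join()
    return results


if __name__ == "__main__":
    jobs = [(i, lambda i=i: i * i) for i in range(6)]
    for job_id, agent, value in sorted(run_grid(jobs, ["agent-1", "agent-2"])):
        print(job_id, agent, value)
```

Which agent ends up running which job depends on who is free at that moment, and that is the point: whoever polls first gets the work.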
Adding some structure: the new queues
That grid-like architecture would be a basic configuration with all the available nodes connected to the same queue. But that’s not the only option. In RapidMiner 8.0, queues have acquired a new meaning.
Each Job Agent can pick up jobs from only one queue, but multiple Job Agents can connect to each queue. With this ‘one queue to many agents’ relationship one can effectively configure sub-clusters that can serve different purposes. This is a great tool for administrators to achieve good resource management.
For example, different teams can have their own sub-clusters, but they can also share a common one. Or, within a group, there might be a standard queue and a high-priority one where only certain users or applications are allowed to send jobs.
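A small Python sketch shows how such a setup could be described. Everything here is hypothetical (queue names, agent names, users, and the helpers `sub_clusters` and `can_submit`); it is not the RapidMiner Server configuration format. Mapping each agent to a single queue name makes the "one queue per agent" rule hold by construction.

```python
from collections import defaultdict

# Hypothetical topology: each Job Agent points at exactly one queue.
AGENT_QUEUE = {
    "team-a-1": "team-a",
    "team-a-2": "team-a",
    "team-b-1": "team-b",
    "shared-1": "shared",
    "urgent-1": "high-priority",
}

# Hypothetical access rules: None means every user may send jobs to the queue.
QUEUE_USERS = {
    "team-a": {"alice", "bob"},
    "team-b": {"carol"},
    "shared": None,
    "high-priority": {"alice"},
}


def sub_clusters(agent_queue):
    """Group agents by queue: every queue with its agents forms a sub-cluster."""
    clusters = defaultdict(list)
    for agent, queue_name in agent_queue.items():
        clusters[queue_name].append(agent)
    return {name: sorted(agents) for name, agents in clusters.items()}


def can_submit(user, queue_name, queue_users=QUEUE_USERS):
    """Return True if the user is allowed to send jobs to the given queue."""
    if queue_name not in queue_users:
        return False
    allowed = queue_users[queue_name]
    return allowed is None or user in allowed


if __name__ == "__main__":
    print(sub_clusters(AGENT_QUEUE))
    print(can_submit("carol", "shared"), can_submit("carol", "high-priority"))
```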
Another option is to split the cluster depending on the needs of the user processes. Typically, one would send big training processes to a large machine with enough memory (“training queue”), while lightweight scoring processes go to another sub-cluster with less memory, but maybe more CPUs to take advantage of parallelization (the “scoring queue”).
One could also have Job Agents specialized in certain extensions with particular needs, like Keras (Deep Learning), which has specific installation pre-requisites.
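The decision of which queue fits a process can be sketched in a few lines of Python. This only illustrates the reasoning an administrator or user would follow; it is not something RapidMiner Server is claimed to do automatically, and the queue names, memory sizes, and the `choose_queue` helper are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueProfile:
    name: str
    memory_gb: int
    extensions: frozenset = frozenset()  # extensions installed on the agents


@dataclass(frozen=True)
class ProcessNeeds:
    memory_gb: int
    extensions: frozenset = frozenset()  # extensions the process requires


# Hypothetical sub-clusters, described by what their agents offer.
QUEUES = [
    QueueProfile("scoring", memory_gb=8),
    QueueProfile("training", memory_gb=128),
    QueueProfile("deep-learning", memory_gb=64, extensions=frozenset({"keras"})),
]


def choose_queue(needs, queues=QUEUES):
    """Pick the smallest queue that has enough memory and all required extensions."""
    candidates = [
        q for q in queues
        if q.memory_gb >= needs.memory_gb and needs.extensions <= q.extensions
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q.memory_gb).name


if __name__ == "__main__":
    print(choose_queue(ProcessNeeds(memory_gb=2)))
    print(choose_queue(ProcessNeeds(memory_gb=100)))
    print(choose_queue(ProcessNeeds(memory_gb=16, extensions=frozenset({"keras"}))))
```

Picking the smallest queue that fits keeps the large training machines free for the jobs that really need them.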
Reliability and fault tolerance
All of this comes down to having dedicated resources, local or remote, for processes, which gives us an interesting and powerful feature: now that everything runs independently, no process running amok will affect the others, and we get a highly reliable and robust system.
Another interesting side effect is increased fault tolerance, especially in the execution pieces (the Job Agents). A queue is fault tolerant by default as soon as more than one Job Agent is connected to it. If, for any reason, one of them fails, another Job Agent will continue picking up jobs from the same queue and users will not be affected. Only the job that was running on the failed Job Agent will be lost.
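The behavior can be illustrated with a toy simulation in Python. It is a simplified model under our own assumptions (agents take turns picking jobs, and one agent crashes while running a specific job), not a description of how RapidMiner Server is implemented; the `simulate` function and its parameters are hypothetical.

```python
from collections import deque


def simulate(jobs, agents, crash_agent, crash_on_job):
    """Agents take turns picking jobs; crash_agent dies if it picks crash_on_job.

    Returns (completed, lost, stranded): the jobs that finished, the job lost in
    the crash, and the jobs left in the queue because no agent remained.
    """
    pending = deque(jobs)
    alive = list(agents)
    completed, lost = [], []
    turn = 0
    while pending and alive:
        agent = alive[turn % len(alive)]
        job = pending.popleft()
        if agent == crash_agent and job == crash_on_job:
            alive.remove(agent)  # the agent is gone for good
            lost.append(job)     # only the job it was running is lost
        else:
            completed.append((job, agent))
            turn += 1
    return completed, lost, list(pending)


if __name__ == "__main__":
    jobs = ["j1", "j2", "j3", "j4"]
    # Two agents on the queue: the survivor keeps picking up jobs.
    print(simulate(jobs, ["agent-1", "agent-2"], "agent-2", "j2"))
    # A single agent on the queue: the remaining jobs are stranded.
    print(simulate(jobs, ["agent-1"], "agent-1", "j2"))
```

With two agents, only the job running at the time of the crash is lost and the rest still complete. With a single agent, the queue simply stops being served.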
RapidMiner Server 8.0 is just a first step. We still have a lot in store for future releases, like full high availability, centralized configuration, and an improved UI. Stay tuned!