RapidMiner best practices for multiple customers/models

divya_das · April 2019

Hi,

What are some best practices to follow when using RapidMiner Server for multiple customers?
Suppose I have a RapidMiner Server where the training and prediction processes are exposed as web services. The processes have been parameterized (using macros) to dynamically handle training/prediction for different customers. There are two models per customer.
1. How do we manage the models when their number becomes huge? Currently, we have one folder per customer.
2. How do we handle the load when multiple prediction processes are being invoked? Should we use multiple RapidMiner Servers? Is there any scaling mechanism (auto-scaling) to scale the training/prediction processes horizontally?
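
For illustration, the setup described above (one folder per customer, processes driven by a customer macro) can be sketched as two small helpers that derive a model's repository location and a web service URL from a customer ID. This is only a sketch under assumptions: the repository root, the host name, the service name, and the macro name `customer_id` are all hypothetical, and the `/api/rest/process/` path is an assumption about how the server exposes the service, so check it against your own installation.

```python
from urllib.parse import quote, urlencode

# Hypothetical values, not taken from the thread.
REPO_ROOT = "/home/customers"
BASE_URL = "http://rm-server.example.com:8080/api/rest/process"


def model_location(customer_id: str, model_name: str) -> str:
    """Return the repository path of a model in a one-folder-per-customer layout."""
    return f"{REPO_ROOT}/{customer_id}/models/{model_name}"


def service_url(service: str, customer_id: str) -> str:
    """Build a web service URL that passes the customer ID as a query parameter."""
    query = urlencode({"customer_id": customer_id})
    return f"{BASE_URL}/{quote(service)}?{query}"
```

With this layout, adding a customer only means adding a folder; the processes themselves stay unchanged because the customer ID arrives as a parameter.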

Regards,
Divya

sgenzer · April 2019

Hi @divya_das -

I'm not the resident expert on RM Server deployment solutions but here are some thoughts for you...

1. When you say the number of models becomes huge, do you mean that all these models are in production, or are some of them legacy models? If it's the latter, I would certainly archive older models in some storage solution - the models are just file objects that you can archive anywhere.

Otherwise, if it were me, I guess I'd start making subfolders (you can even automate their creation using Create Directory). Can you help me understand why there are so many models when there are only two per customer?

2. For scaling: yes, you only need one RM Server, but you should create unique job agents to scale outwards (see this docs page for an overview of scaling architecture in RM Server). To handle the load, you can upgrade to High Availability load balancing.
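
To make the archiving idea from point 1 concrete, here is a minimal sketch that moves model files older than a given age into an archive folder while keeping the per-customer subfolder structure. It assumes the models have been exported as ordinary files on disk and that the archive folder lives outside the models folder; the function name and the age threshold are hypothetical, and this is not a RapidMiner API.

```python
import shutil
import time
from pathlib import Path


def archive_old_models(models_dir: Path, archive_dir: Path, max_age_days: float) -> list:
    """Move files last modified more than max_age_days ago into archive_dir.

    Assumes archive_dir is not located inside models_dir.
    Returns the new locations of the moved files.
    """
    cutoff = time.time() - max_age_days * 86400
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(models_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            # Keep the customer subfolder structure inside the archive.
            target = archive_dir / path.relative_to(models_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

For example, `archive_old_models(Path("models"), Path("archive"), 365)` would move everything untouched for a year and leave current models where they are.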

Scott

divya_das · April 2019

Hi Scott,

Thanks for the ideas. I will go through the articles you have mentioned.
We are planning for multiple use cases that will be in production: say, one for linear regression, one for logistic regression, etc. We will have one model per customer for each use case. For now, we are planning to keep one folder for each customer. But what if we have 100 customers? Then we will end up with 100 folders.

Thanks,
Divya

MartinLiebig · April 2019

Hi @divya_das ,

I don't see any reason why 100 folders would be an issue. In my opinion, it's the most straightforward way to do this. Any other solution (like a HashMap of models) adds extra overhead when loading it.
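
The loading argument can be illustrated with a small sketch that stores stand-in models both ways: one file per customer folder, and one bundled map holding every customer's model. It uses Python's `pickle` and plain dictionaries purely as an analogy; the function names and file names are hypothetical, and real RapidMiner models are repository objects rather than pickles.

```python
import pickle
from pathlib import Path


def fake_model(customer: str) -> dict:
    """Stand-in for a trained model (hypothetical content)."""
    return {"customer": customer, "weights": [0.1] * 1000}


def write_layouts(root: Path, customers: list) -> Path:
    """Store each model in its own customer folder and also in one bundled map."""
    for customer in customers:
        folder = root / customer
        folder.mkdir(parents=True, exist_ok=True)
        (folder / "model.pkl").write_bytes(pickle.dumps(fake_model(customer)))
    bundle = root / "all_models.pkl"
    bundle.write_bytes(pickle.dumps({c: fake_model(c) for c in customers}))
    return bundle


def load_from_folder(root: Path, customer: str):
    """Read only the requested customer's file; return (model, bytes read)."""
    path = root / customer / "model.pkl"
    return pickle.loads(path.read_bytes()), path.stat().st_size


def load_from_bundle(bundle: Path, customer: str):
    """Deserialize the whole map to get one model; return (model, bytes read)."""
    models = pickle.loads(bundle.read_bytes())
    return models[customer], bundle.stat().st_size
```

Both loaders return the same model, but the folder lookup reads a single customer's file, while the bundled map has to be read and deserialized in full for every request, and that cost grows with the number of customers.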

Best,

Martin

divya_das · April 2019

Thanks Martin. I will go with the folder approach, one folder for each customer.
