Optimize Selection Evolutionary, Parallel Scaling
hughesfleming68
Has anyone done any tests to determine how well RapidMiner 7 scales on multicore CPUs? Particularly on machines with 16 threads or more?
Many thanks,
Alex
Answers
Hi Alex,
One of the best experts for this would be @land, as his company developed an extension specifically for parallelizing efficiently in RapidMiner.
I think you spoke with him on the forum recently. I'm sure he has some good information in that area.
Regards,
John.
Thanks John,
I will ask him. I will also try to set up a test myself over the next couple of weeks.
regards,
Alex
In principle, keep in mind that each thread needs its own copy of the data, so your available memory has to scale with the number of threads you use.
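To make the memory point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the simplest possible model, namely that every worker thread holds one full copy of the data on top of a fixed base overhead. The function names and the 1 GB default overhead are made up for illustration and are not RapidMiner settings.

```python
def estimate_memory_gb(dataset_gb: float, threads: int,
                       base_overhead_gb: float = 1.0) -> float:
    """Rough memory need if every thread holds a full copy of the data.

    base_overhead_gb is an assumed fixed cost (runtime, GUI, etc.).
    """
    return base_overhead_gb + dataset_gb * threads


def max_threads_for_memory(dataset_gb: float, available_gb: float,
                           cpu_threads: int,
                           base_overhead_gb: float = 1.0) -> int:
    """How many threads fit into memory under the same assumption."""
    if dataset_gb <= 0:
        raise ValueError("dataset_gb must be positive")
    usable_gb = available_gb - base_overhead_gb
    if usable_gb <= 0:
        return 0
    # Memory allows usable_gb // dataset_gb copies; never exceed the CPU.
    return min(cpu_threads, int(usable_gb // dataset_gb))


if __name__ == "__main__":
    # A 2 GB data set on a 16-thread machine with 16 GB of RAM.
    print(estimate_memory_gb(2.0, 16))            # memory needed for all 16 threads
    print(max_threads_for_memory(2.0, 16.0, 16))  # threads that actually fit
```

Under these assumptions the memory, not the CPU, caps the usable thread count on such a machine. Real usage depends on how much of the data each operator actually copies.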
The easiest way is to run the cross validation on multiple threads; this directly gives a nearly linear speedup, close to x times for x threads.
However, since one usually runs a 10-fold cross validation (I usually make it 8 folds to match my CPU cores), this speedup is limited by the number of folds. If you need to utilize more threads than that, you also have to run the outer operators in parallel.
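The fold limit can be worked through with a small idealized calculation. The sketch below assumes every fold takes the same time, there is no scheduling overhead, and folds run in "rounds" of at most one fold per thread. These are simplifying assumptions for illustration, not measurements from RapidMiner.

```python
import math


def cv_rounds(folds: int, threads: int) -> int:
    """Number of sequential rounds needed to run all folds."""
    return math.ceil(folds / threads)


def cv_speedup(folds: int, threads: int) -> float:
    """Ideal speedup over running the folds one after another."""
    return folds / cv_rounds(folds, threads)


if __name__ == "__main__":
    for folds, threads in [(10, 8), (8, 8), (10, 16), (10, 32)]:
        print(f"{folds} folds on {threads} threads: "
              f"{cv_rounds(folds, threads)} round(s), "
              f"ideal speedup {cv_speedup(folds, threads):.1f}x")
```

In this model, 10 folds on 8 threads need two rounds (8 folds, then 2), so the wall-clock time is twice that of 8 folds on 8 threads. That is one reason to match the fold count to the core count. It also shows that going from 16 to 32 threads buys nothing for a single 10-fold validation.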
I usually avoid this and instead run multiple processes in parallel. One rarely performs just ONE optimization run; there are usually several, one for each method. This way you can easily put even bigger servers under full load.
And of course, real-world projects usually need not just one model but several. So you can also loop over groups of data and compute their models in parallel.
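Outside of RapidMiner, the "one model per group, computed in parallel" idea might look like the following Python sketch. Everything here is illustrative: the row layout (group, x, y), the helper names, and the toy least-squares line fit standing in for a real model are all assumptions. It uses a thread pool for simplicity; for CPU-heavy training in Python you would likely swap in `concurrent.futures.ProcessPoolExecutor`.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def fit_models_per_group(rows, max_workers=4):
    """Split (group, x, y) rows by group and fit one model per group."""
    groups = defaultdict(list)
    for group, x, y in rows:
        groups[group].append((x, y))
    names = list(groups)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so models line up with names.
        models = list(pool.map(fit_line, (groups[name] for name in names)))
    return dict(zip(names, models))


if __name__ == "__main__":
    rows = [("a", 0, 1), ("a", 1, 3), ("a", 2, 5),
            ("b", 0, 0), ("b", 1, -1), ("b", 2, -2)]
    for name, (slope, intercept) in fit_models_per_group(rows).items():
        print(name, slope, intercept)
```

Because each group is independent, the number of useful workers is bounded by the number of groups rather than by the number of folds, which is why this approach scales to machines with many threads.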
Until recently we offered the Jackhammer Extension, which added a lot of the necessary functionality.
Greetings,
Sebastian