[SOLVED] Flexible Learner Replacement
I'm fairly new to RapidMiner, but I had to get into it quickly because of my new job.
Currently I'm working on predictive analysis for maintenance issues.
My data set contains some ten thousand examples with about a thousand attributes.
Because of speed issues, I've designed a selective preprocessing step: I split the data into subsets of different attributes, run forward selection combined with cross-validation on each subset, and keep the attributes of each subset that have the biggest impact on the result. I then join the results back together for a final analysis. This process currently uses about 20 learner operators, e.g. SVM, Naïve Bayes or Decision Tree (the process will be shown in the following post).
Switching from one learning method to another is dull and tiring, since I have to replace all 20 operators.
So I thought of a macro-like two-component system for flexible learner replacement.
These two components could look like this:
The first component is a nested operator that contains the learner to be used. It might also need an ID/name parameter, so that several of these containers can be used in one process.
The second component is linked to a specific first component (via the ID/name). It simply retrieves the defined learner operator with all its parameters.
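To make the idea concrete, here is a minimal Python sketch of the two components. It is only an illustration of the concept, not RapidMiner code: `LearnerConfig`, `define_learner` and `retrieve_learner` are hypothetical names, and the "learner" is just a class name plus a parameter dictionary.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class LearnerConfig:
    """Hypothetical stand-in for a learner operator and its parameters."""
    operator_class: str
    parameters: Dict[str, object] = field(default_factory=dict)

# Component 1: a container that stores one learner definition under an ID.
_containers: Dict[str, LearnerConfig] = {}

def define_learner(container_id: str, config: LearnerConfig) -> None:
    _containers[container_id] = config

# Component 2: a reference that retrieves the learner defined under that ID.
def retrieve_learner(container_id: str) -> LearnerConfig:
    # Return a copy so each usage site gets its own independent instance.
    return copy.deepcopy(_containers[container_id])

# Define the learner once, then use it at 20 places in the process.
define_learner("main", LearnerConfig("SVM", {"C": 1.0, "kernel": "rbf"}))
learners = [retrieve_learner("main") for _ in range(20)]

# Switching the method now means changing a single definition.
define_learner("main", LearnerConfig("DecisionTree", {"max_depth": 10}))
switched = [retrieve_learner("main") for _ in range(20)]
```

Each retrieval returns a copy, so the 20 usage sites share one definition without sharing state.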
This would also come in really handy when you want to optimize the whole process (which is the second point of my idea). Consider a learner with 3 parameters and 2 choices for each parameter. Without a container, each of the 20 learners has its own set of parameters, which gives (2^3)^20 = 2^60 combinations. With just one learner configuration used throughout the whole process, that drops to 2^3 = 8 combinations. This would not only save computation time; it would also save design time, since I would only have to choose and set 3 parameter ranges in the optimization operator instead of 60 (not to mention the debugging).
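The size of the two search spaces can be checked with a few lines of Python. This sketch assumes a plain grid search in which the 20 learners are tuned independently, i.e. (2^3)^20 = 2^60 combinations; `grid_size` is a hypothetical helper name.

```python
def grid_size(choices_per_param: int, n_params: int,
              n_learners: int, shared: bool) -> int:
    """Number of parameter combinations a full grid search has to visit."""
    per_learner = choices_per_param ** n_params
    # With a shared configuration the grid does not grow with the learner count.
    return per_learner if shared else per_learner ** n_learners

independent = grid_size(2, 3, 20, shared=False)  # every learner tuned on its own
shared = grid_size(2, 3, 20, shared=True)        # one configuration for all learners

print(f"independent: {independent:.3e} combinations")
print(f"shared:      {shared} combinations")
```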
I believe there is some kind of workaround using macros for the problem of too many combinations, but it would probably make the design phase even more complicated and tiring.
Or maybe there is a cool solution that uses the XML code and a replace function instead of the GUI.
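As a rough idea of what such an XML replacement could look like, here is a Python sketch. It assumes the process file consists of nested `<operator>` elements with `name` and `class` attributes and `<parameter>` children. The sample XML, the class names and the `swap_learner` helper are made up for illustration, so check them against a real exported process before relying on this.

```python
import xml.etree.ElementTree as ET

# Toy process file; operator names and classes are illustrative assumptions.
PROCESS_XML = """<process>
  <operator name="Root" class="Process">
    <operator name="FS_1" class="FeatureSelection">
      <operator name="Learner_1" class="NaiveBayes"/>
    </operator>
    <operator name="Learner_2" class="NaiveBayes">
      <parameter key="laplace_correction" value="true"/>
    </operator>
  </operator>
</process>"""

def swap_learner(xml_text, old_class, new_class, drop_parameters=True):
    """Replace every operator of old_class with new_class; return (xml, count)."""
    root = ET.fromstring(xml_text)
    replaced = 0
    # Materialize the iterator first so the tree can be modified safely.
    for op in list(root.iter("operator")):
        if op.get("class") == old_class:
            op.set("class", new_class)
            if drop_parameters:
                # Parameters of the old learner rarely apply to the new one.
                for param in op.findall("parameter"):
                    op.remove(param)
            replaced += 1
    return ET.tostring(root, encoding="unicode"), replaced

new_xml, n_replaced = swap_learner(PROCESS_XML, "NaiveBayes", "DecisionTree")
print(n_replaced, "learner operators replaced")
```

Dropping the old parameters leaves the new learner on its defaults. Setting one shared parameter set on all replaced operators would be a small extension of the same loop.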