"Controlling loops (break/continue)"
In my experience, a process quickly becomes hard to read once it contains nested loops or branches. In some cases I would have been happy if the "Branch" operator were a simple operator instead of a super operator, one that just delivers the input data to either a 'then' or an 'else' output port. This way you could not combine or modify the input data separately for each port, but in most cases I didn't need that anyway. It would be a simple, clear switch that routes the data depending on the condition, and perhaps an alternative to the usual "Branch" operator for simple decisions. But this is just a thought that came to mind a few times during process design...
My actual question is a different one: I don't know how the different loops are translated into Java code internally, but I guess they use one of the language's standard loop constructs. Is there a way to control loops in RapidMiner the way Java allows with continue/break? Or does this conflict with RapidMiner's process structure? Before trying to write some (hopefully simple) operators for this, I wanted to make sure it is possible at all.
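To make clear which behaviour I mean, here is a minimal plain-Java sketch of continue/break. It has nothing to do with RapidMiner's internals; the class and method names are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopControlSketch {

    // Hypothetical helper: keeps positive values, skips negative ones,
    // and stops the whole loop at the first zero.
    static List<Integer> collect(int[] values) {
        List<Integer> kept = new ArrayList<>();
        for (int v : values) {
            if (v < 0) {
                continue; // skip only this iteration, go on with the next value
            }
            if (v == 0) {
                break; // leave the loop completely
            }
            kept.add(v);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(collect(new int[] {3, -1, 4, 0, 5})); // prints [3, 4]
    }
}
```

What I am looking for is an operator that does for a RapidMiner loop what `continue` (or `break`) does here.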
Maybe you also have some alternative suggestions. In the current case I am using "Loop Examples" on a list of URLs, retrieving each page via "Get Page" and then extracting some information from it. I already had to add one "Branch" after "Get Page" so that the process doesn't fail if a single page wasn't retrieved properly (due to connection problems or something else). Now there are some cases in which the subsequent XPath interpreter aborts the process because of invalid XHTML code. In such a case the page doesn't contain any useful information, and the current loop iteration could simply stop at this point. Instead of using another super operator and putting the major part of the process inside it, I would prefer a simple single operator or something similar that skips/stops the iteration without producing a result. If I just eliminate the error sources (as I have done for now), I end up with mostly empty examples that have to be filtered out later.
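Written as plain Java, the flow I have in mind would look roughly like the sketch below. This is only an illustration under assumptions: the map of already retrieved pages stands in for "Get Page" (a null value means the page could not be retrieved), the XPath expression `/html/head/title` is just an example, and the class and method names are hypothetical. It uses only the standard `javax.xml` classes, not any RapidMiner API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class SkipIterationSketch {

    // Hypothetical stand-in for the loop body: pages maps URL -> page source,
    // where null means the page was not retrieved.
    static List<String> extractTitles(Map<String, String> pages) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> titles = new ArrayList<>();

        for (Map.Entry<String, String> entry : pages.entrySet()) {
            String source = entry.getValue();
            if (source == null) {
                continue; // page not retrieved: skip this iteration
            }
            Document doc;
            try {
                doc = factory.newDocumentBuilder()
                        .parse(new InputSource(new StringReader(source)));
            } catch (SAXException invalidMarkup) {
                continue; // invalid XHTML: skip this iteration, no empty result
            }
            titles.add(xpath.evaluate("/html/head/title", doc));
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("http://example.com/a", "<html><head><title>A</title></head><body/></html>");
        pages.put("http://example.com/b", null);
        pages.put("http://example.com/c", "<html><head><title>B</head>");
        pages.put("http://example.com/d", "<html><head><title>C</title></head><body/></html>");
        System.out.println(extractTitles(pages)); // prints [A, C]
    }
}
```

The point is that the two failing pages produce no output at all, so nothing has to be filtered out after the loop.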
I hope my idea and question are understandable. Perhaps I am just thinking in the wrong direction, in which case I'd be glad if someone could point me towards a proper solution.
P.S. The only related question I found was in an older topic (http://rapid-i.com/rapidforum/index.php/topic,892.0.html) which didn't provide a real solution.