Change the Execution Order of Processes

MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

Question

RapidMiner always executes one operator at a time. How can I change the order?

Answer

Changing the execution order is usually not necessary. There are only a few cases where you need to do it:

 

  • An operator needs the result of a former operator which cannot or should not be connected (e.g. Remember or Extract Macro)
  • You would like to inspect the result of one operation first (using e.g. breakpoints)


To do this, click the small blue icon in the upper-right corner of the Process Panel.

1.png

Once you have clicked on it, you see the actual execution order. You can now right-click on the numbers to make the execution of that operator as early as possible.

3.png

 

You can also change the order manually by clicking on the first operator (here: Multiply) and then on the operator that should be executed next (here: Validation).

- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany

Comments

  • Tripartio Member Posts: 37 Maven
    I had a bug that was driving me crazy: I had two identical processes that were giving slightly different results. I checked over and over again that the random seeds were identical and that every single operator was running the latest version of RapidMiner (no compatibility mode anywhere). The two processes were created separately feeding from the same source data file. The only difference that I could see was that the operators were not in the same identical positions, but I do not think that could be the cause for the slightly different results. The results were just close enough that I could tell that it was surely a random seed issue, but as I said, the seeds were identical across the two processes.

    Finally, to debug the issue, I copied the XML version of each process and ran them through a diff checker. As expected, the X-Y positions of most operators were slightly different. But what popped out, to my surprise, was that some operators were not in the same order. So I used the principles in the post above to order the two processes identically. And voilà, the results became identical.

    I'm leaving this comment here in case anyone runs into the problem that I did: even when everything else is identical, including random seeds, different orders of operator execution can cause slight (random) differences in results between two RapidMiner processes.
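The comparison described above can also be scripted instead of reading a raw diff. The following is only a minimal sketch: it assumes that each operator appears as an `<operator>` element with a `name` attribute in the exported process XML and that the order of those elements reflects the execution order, which is what the diff in this comment suggested. The two XML snippets and the helper names `operator_order` and `first_mismatch` are hypothetical, simplified illustrations, not real exports or RapidMiner functions.

```python
import xml.etree.ElementTree as ET


def operator_order(process_xml):
    """Return the operator names in the order they appear in the XML."""
    root = ET.fromstring(process_xml)
    # iter() walks the tree in document order, including nested operators.
    return [op.get("name") for op in root.iter("operator")]


def first_mismatch(order_a, order_b):
    """Return (index, name_a, name_b) for the first differing position, or None.

    Only the common prefix is compared; extra trailing operators are ignored.
    """
    for i, (a, b) in enumerate(zip(order_a, order_b)):
        if a != b:
            return i, a, b
    return None


# Hypothetical, heavily simplified process exports for illustration only.
PROCESS_A = """
<process>
  <operator name="Process">
    <operator name="Retrieve"/>
    <operator name="Split Data"/>
    <operator name="Random Forest"/>
  </operator>
</process>
"""

PROCESS_B = """
<process>
  <operator name="Process">
    <operator name="Retrieve"/>
    <operator name="Random Forest"/>
    <operator name="Split Data"/>
  </operator>
</process>
"""

order_a = operator_order(PROCESS_A)
order_b = operator_order(PROCESS_B)
print(order_a)
print(order_b)
print(first_mismatch(order_a, order_b))
```

If the function reports a mismatch, the operators at that position are the ones to reorder in the Process Panel as described in the answer above.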

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    I am sorry that you experienced this. There are of course cases where a difference in execution order makes a difference in the results. Usually those are related to macro usage, not to random seeds. If every operator that uses a random number generator uses its own random seed, there should be no problem. Only if operators access the global generator does the order make a difference.

    BR,
    Martin
  • Tripartio Member Posts: 37 Maven
    @mschmitz, why would there be a difference if each operator uses the default random seed for the process? (In my case, I use the default seed 1234 and none of the operators uses their own custom seed.) That is, if all operators use the same default random seed, why would order of execution make a difference in the results? Is the global seed value changed when the random number generator is invoked?
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,
    Every process has one global random generator. By default, this is the one every operator uses.

    Let's say you have two operators that need random numbers, and each needs three of them. The generator provides them in sequence, for example:
    1,2,3,4,5,6

    If operator A is executed first, it gets 1,2,3. If it is executed second, it gets 4,5,6. So the values it receives differ.

    If every operator uses its own seed, there should be no problem, since each one then creates a new generator (or resets its generator to its starting seed).

    Best,
    Martin
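The effect described in the last reply can be reproduced outside RapidMiner. The sketch below is not RapidMiner code; it uses Python's `random.Random` as a stand-in for the process-wide generator, and the function `run_process`, the operator names "A" and "B", and the local seed values are made up for illustration. Each simulated operator draws three numbers, either from the shared generator or from its own locally seeded one.

```python
import random


def run_process(order, global_seed=1234, local_seeds=None):
    """Simulate operators that each draw three random numbers.

    order       -- operator names in execution order
    global_seed -- seed of the single generator shared by the process
    local_seeds -- optional dict mapping operator name to its own seed
    """
    shared = random.Random(global_seed)  # one generator for the whole process
    results = {}
    for name in order:
        if local_seeds and name in local_seeds:
            rng = random.Random(local_seeds[name])  # operator's own generator
        else:
            rng = shared  # default: consume from the shared stream
        results[name] = [rng.random() for _ in range(3)]
    return results


# Shared generator: whoever runs first consumes the first three numbers.
ab = run_process(["A", "B"])
ba = run_process(["B", "A"])
print(ab["A"] == ba["A"])  # A's numbers depend on its position
print(ab["A"] == ba["B"])  # the first operator always gets the same numbers

# Local seeds: each operator's numbers no longer depend on the order.
seeds = {"A": 1, "B": 2}
ab_local = run_process(["A", "B"], local_seeds=seeds)
ba_local = run_process(["B", "A"], local_seeds=seeds)
print(ab_local == ba_local)
```

With the shared generator, swapping A and B swaps which part of the random stream each one receives, which matches the slightly different results reported earlier in this thread. With local seeds, the execution order no longer affects the numbers.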