Macro Propagation into Subprocesses and Execute Process Operators

mdr Member Posts: 7 Contributor II
edited November 2018 in Help
Thanks Simon. I got it working.

I have another question - I posted it on the BI / ETL board as well but no responses. Maybe you can suggest something.

I made a 'Read DB' process that reads the data, Discretizes it and Splits it.
I am now using this 'Read DB' process in several learners.

I am essentially creating a model for each learner using GridOptimize and I want to write out the model, params, results etc in a directory by {process_name}.

So I want to set a path in the main process (Read DB) that is then used by the sub processes. The only way to do this, it seems, is to put a 'Set Macro' operator in each of the child processes and set the path individually there.

Would there be a way to do this where I set it once in the main process and use it in the calling operators?

Thanks.

mdr

Answers

  • SebastianLoh Member Posts: 99  Maven
    Hi mdr,

    I've made a little demonstration process which shows you how to solve your problem.

    Use the Community Plugin and search for "Macro Propagation into Subprocesses and Execute Process Operators" to open the process in RM.

    Ciao Sebastian
  • mdr Member Posts: 7 Contributor II
    hi Sebastian -

    Thank you for the reply. It worked great!

    However I have two other questions now.

    1. Is there an operator that will allow me to start the subprocess AFTER the first (Execute Process) has finished?

    2. I need to set some parameters in the first (Execute Process) that I would like to be used by the SubProcess, since it will be executing after the first terminates.

    I am trying to build a library of processes that interact with each other. For example, GridOptimizer writes out the Performance Vector and Model, and then Apply Model comes and picks up the parameters for a given dataset based on the name of the Model and such.

    Please let me know what options may be available. I hope I am approaching it the right way.

    Again thanks immensely for all your help. I can update my processes if you like but I am essentially working off the idea you demonstrated.

    Best.

    monosij
  • SebastianLoh Member Posts: 99  Maven
    Hi mdr,

    Regarding 1): sure, just place the operators in a row; the first in line is then executed before the second (they do not actually need to be connected, the execution order just has to be right). See also: http://rapid-i.com/rapidforum/index.php/topic,2275.msg8973.html#msg8973

    Regarding 2): You might take a look at the Remember and Recall operators. In combination with macros you can design very complex processes, but it is not trivial.

    However, let's not get carried away in the Getting Started Board ;-)

    Ciao Sebastian
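
As a rough illustration of the Remember / Recall idea mentioned above, here is a minimal Python sketch of a store of named objects shared by two steps that run one after the other. It is only a conceptual model, not RapidMiner code: the `ObjectStore` class, the step functions, and the parameter values are all hypothetical, and the thread does not say whether such a store is shared across Execute Process boundaries.

```python
class ObjectStore:
    """Toy stand-in for a store of named objects (hypothetical, not a RapidMiner API)."""

    def __init__(self):
        self._objects = {}

    def remember(self, name, obj):
        # Keep the object under the given name for later steps.
        self._objects[name] = obj

    def recall(self, name):
        # Fail loudly if a later step asks for something never remembered.
        if name not in self._objects:
            raise KeyError(f"nothing remembered under {name!r}")
        return self._objects[name]


def train_step(store):
    # Hypothetical first step: stores the parameters it found.
    store.remember("best_params", {"C": 10, "gamma": 0.1})


def apply_step(store):
    # Hypothetical second step: picks up what the first step stored.
    params = store.recall("best_params")
    return f"applying model with C={params['C']}, gamma={params['gamma']}"


store = ObjectStore()
train_step(store)            # must run first
message = apply_step(store)  # runs second and recalls the stored object
```

Calling `apply_step` before `train_step` would raise a `KeyError` in this sketch, which is why the execution order of the two steps matters.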
  • mdr Member Posts: 7 Contributor II

    hi Sebastian -

    Thanks for your reply. I will try it out. When you say 'place the operators in a row and then the first in line is executed before the second', do you mean in the tree, or just in the work-flow space? I have difficulty understanding how just placing them in a row one after another determines their execution sequence. Also, if I (or someone else) experiment with the work-flow, the operators may not stay in the same order once things are moved around.

    My manager suggested I try just taking an output from the first process and connecting it to the second, like a dummy input. Will that help? My question is: inside the second process, how will it determine which input to use, as I am using 1 input? But that again would be superfluous, I feel, since if someone disconnected and reconnected the first input, its order into the process would change.

    I will take a look at the Remember / Recall operators.
    -------------------------
    On another note: The sub process that I want to start second is a GRID-OPTIMIZE APPLY sub-process as below:

    READ MODEL > APPLY MODEL > PERFORMANCE > WRITE PERFORMANCE

    So it has a READ MODEL operator which I want to read a model based on 'path' and 'optimizer' that I will be setting in the Macro or in the first process. So the READ MODEL has this for model file: %{path}\%{optimizer}.model.dat.

    However, the READ MODEL operator cannot seem to read the model based on the path and optimizer macros set outside in the main process. The path has to be hard-coded in there, but even then it is sketchy. What definitely works for the GRID-OPTIMIZE APPLY process is if it is totally self-contained. That is, my initial process of READ DATABASE (that starts everything off) also needs to be in GRID-OPTIMIZE APPLY; otherwise the READ MODEL operator in the GRID-OPTIMIZE APPLY process does not work.

    The WRITE PERFORMANCE operator in the GRID-OPTIMIZE APPLY process however does pick up the macros and writes output fine.

    The READ MODEL operator is kind of a standalone operator that feeds the model into the APPLY MODEL operator in GRID-OPTIMIZE APPLY process. Wondering if that needs to be set up in any special way so as to pick up the macros for path and optimizer and apply the right model.

    I hope my explanations are clear. If you like I can also upload the process to MyExperiment, but really GRID-OPTIMIZE APPLY is a simple process that just has READ MODEL > APPLY MODEL > PERFORMANCE > WRITE PERFORMANCE.

    Thanks again for your help Sebastian. And great game from Germany!

    monosij
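
To make the `%{path}\%{optimizer}.model.dat` template above concrete, here is a minimal Python sketch of `%{name}`-style macro expansion. It is a conceptual illustration only, not RapidMiner's implementation: the `expand_macros` helper and the example values (`C:\models`, `grid_svm`) are hypothetical, and leaving an undefined macro untouched is an assumption made for the sketch.

```python
import re

# Matches %{name} and captures the macro name.
MACRO_PATTERN = re.compile(r"%\{([^}]+)\}")


def expand_macros(template, macros):
    """Replace each %{name} with macros[name]; leave unknown names as they are."""
    def substitute(match):
        name = match.group(1)
        # Assumption for this sketch: an undefined macro stays in the string as-is.
        return macros.get(name, match.group(0))
    return MACRO_PATTERN.sub(substitute, template)


template = r"%{path}\%{optimizer}.model.dat"

# Both macros are defined in the scope where the template is expanded.
resolved = expand_macros(template, {"path": r"C:\models", "optimizer": "grid_svm"})

# Neither macro is defined in this scope, so nothing is substituted.
unresolved = expand_macros(template, {})
```

In the second call the result still contains the literal `%{path}` and `%{optimizer}` placeholders, so it does not name a real file. That is the kind of failure one would expect if the macros are not defined in the scope where the file name is resolved.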

  • SebastianLoh Member Posts: 99  Maven
    Hi monosij,

    By placing operators in a row I mean connecting the output of the predecessor to the input of the successor in the Process view (View->Show View->Process).
    Then the first operator is executed before the second operator in line. A process is something like a data flow. To explicitly check the control flow, you can check and change the execution order within the Process tab using the button displaying an up-down arrow with a question mark.

    I suggest taking a look at the tutorial videos http://rapid-i.com/rapidforum/index.php/topic,1750.0.html to get a better idea of how designing processes in RapidMiner works.

    Another helpful source of examples is the processes you can find via the Community Plugin.

    If you take a look at the "Macro Propagation into Subprocesses and Execute Process Operators" process again, you'll also find how you can propagate macros into Execute Process operators.

    However, what you intend to do is really advanced process design and too complex for forum support. I suggest taking a look at our RapidMiner seminars and/or consulting offers, where we can help you to solve your individual problems.

    Ciao Sebastian

    P.S. Indeed, it was a great game for Germany. However, I doubt that the krauts are gonna beat Argentina. Btw: We developed a data mining application with Rapidanalytics and Rapidsentilizer for the World Cup. Take a look at www.mannschaft-der-herzen.de where you can find news trends regarding the teams. Unfortunately the page is in German though...
  • mdr Member Posts: 7 Contributor II

    hi Sebastian -

    Thanks for your responses. Will try out some of your suggestions. Any suggestions on dynamically reading the model would be great, if there is an article or such. I will be needing much more later, with templating the optimizers and such, so I will probably need to take a seminar or something you suggest.

    I could not understand the analytics on the WC at a cursory glance as they are in German, but now I think I believe the Octopus more than Rapid Miner!

    http://www.cnn.com/2010/SPORT/football/07/01/germany.octopus/index.html?hpt=C2

    It even predicts going to extra time!

    Btw I think the German forwards are going to be much faster than the Argentinian rear guards if they take the same drugs as against England! Just kidding! The Germans are a fast team!!

    Good luck SAT!!

    monosij
  • BartN Member Posts: 18  Maven
    Sebastian Loh wrote:

    Hi mdr,

    I've made a little demonstration process which shows you how to solve your problem.

    Use the Community Plugin and search for "Macro Propagation into Subprocesses and Execute Process Operators" to open the process in RM.

    Ciao Sebastian
    Hi Sebastian,
    Is it also possible in the other direction? If a macro is not defined in the main process but in the "Execute Process", how can I use it afterwards in the main process?
    The Execute Process does not seem to return the internal macro, even though macros are supposed to be global, right?

    Kind regards,
    Bart
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869   Unicorn
    Hi, they are only global per process, and do not cross process scopes. If you need the values in the outer process, you could encode them into an example set using Generate Data by User Specification, and pass that example set to the outer process, where you can extract them with Extract Macro.

    Best, Marius
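
The workaround described above can be sketched in Python as a conceptual model. This is not RapidMiner code: the two functions, the macro names, and the values are hypothetical. The inner process gets its own macro scope, returns its values as a one-row table (standing in for the example set built with Generate Data by User Specification), and the outer process reads them back (standing in for Extract Macro).

```python
def run_inner_process(passed_macros):
    """Hypothetical inner process with its own macro scope."""
    # The inner scope starts as a copy, so changes here never reach the caller.
    macros = dict(passed_macros)
    macros["best_model"] = macros["optimizer"] + ".model.dat"
    # Encode the value to hand back as a one-row table (list of row dicts).
    return [{"best_model": macros["best_model"]}]


def run_outer_process():
    """Hypothetical outer process that needs a value defined in the inner one."""
    macros = {"path": "/tmp/models", "optimizer": "grid_svm"}
    result_table = run_inner_process(macros)

    # The inner macro did not leak into the outer scope.
    assert "best_model" not in macros

    # Read the value out of the returned table into an outer macro.
    macros["best_model"] = result_table[0]["best_model"]
    return macros


outer_macros = run_outer_process()
```

The point of the sketch is that the only channel from the inner scope back to the outer one is the returned data, which matches the suggestion to pass an example set out of the inner process.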