RapidMiner

Is it possible to loop over processes in a repository?

SOLVED
Elite II


Let me give you the context of my question. I'm adapting Fabian Temme's (or Thomas Ott's? It wasn't clear to me) nice blog entry on Model Management.

 

I have a collection of processes in a folder named "Algorithms". All of them are processes that take the same dataset and produce the same outputs (model, performance). 

 

[Image: Screen Shot 2017-05-21 at 12.14.10 PM.png]

 

I would like to write another process that would loop over the repository of processes and execute them all. Currently my process looks like this:

[Image: Screen Shot 2017-05-21 at 12.17.02 PM.png]

I want to extend my example to 15-20 different processes with different parameters. Any chance I can loop through them? I tried "Loop Repository" but couldn't make it work.

 

Thanks in advance for any help,

 

\E

 

 

1 ACCEPTED SOLUTION

Elite
Accepted by topic author earmijo
05-22-2017 11:46 AM

Re: Is it possible to loop over processes in a repository?

What if you create an example set (for instance from an external CSV file) that holds your process names as values, and loop through that one? Using Extract Macro you can then get the name of each process and use it as input to call that process. You could either call it directly or use it as a trigger in a branched process loop.

Not very elegant, but it should do the job, I believe.
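
If the processes live in a local repository folder, the external CSV with the process names could be generated rather than typed by hand. The following is only a minimal sketch outside RapidMiner, in Python, assuming that each process is stored as a `.rmp` file in that folder; the function name `write_process_list`, the column name `process_name`, and the paths you pass in are all hypothetical.

```python
import csv
from pathlib import Path


def write_process_list(folder, csv_path, column="process_name"):
    """Write a one-column CSV listing every .rmp file in `folder`.

    Each row holds the file name without its ".rmp" extension.
    `column` is a hypothetical attribute name, pick whatever you
    reference later when extracting the macro.
    """
    # Collect the process names first, sorted for a stable order.
    names = sorted(
        path.stem
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix == ".rmp"
    )
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])  # header row, read as the attribute name
        writer.writerows([name] for name in names)
    return names
```

Calling `write_process_list(<path to the "Algorithms" folder>, "process_list.csv")` would produce a file that can be read into an example set and looped over as described above.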

 

 

 

4 REPLIES

RMStaff

Re: Is it possible to loop over processes in a repository?

Hi \E,

Looping over processes can only be done with a workaround. Unfortunately, there is no built-in solution for that.

 

The attached process works if you have the processes on a local filesystem, not on RapidMiner Server.

Do not store this looping process in the folder whose processes you want to execute; otherwise you might run into issues.

 

Best,

Edin

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="concurrency:loop_files" compatibility="7.5.001" expanded="true" height="82" name="Loop Files" width="90" x="112" y="34">
        <parameter key="directory" value="C:\Users\EdinKlapic\.RapidMiner\repositories\Local Repository\Community\crawl web"/>
        <parameter key="filter_type" value="regex"/>
        <parameter key="filter_by_regex" value=".*\.rmp"/>
        <parameter key="enable_macros" value="true"/>
        <process expanded="true">
          <operator activated="true" class="generate_macro" compatibility="7.5.001" expanded="true" height="82" name="Generate Macro" width="90" x="112" y="34">
            <list key="function_descriptions">
              <parameter key="file_name" value="replace(%{file_name},&quot;.rmp&quot;,&quot;&quot;)"/>
            </list>
          </operator>
          <operator activated="true" class="productivity:execute_process" compatibility="7.5.001" expanded="true" height="82" name="Execute Process" width="90" x="246" y="34">
            <parameter key="process_location" value="crawl web/%{file_name}"/>
            <list key="macros"/>
          </operator>
          <connect from_op="Generate Macro" from_port="through 1" to_op="Execute Process" to_port="input 1"/>
          <portSpacing port="source_file object" spacing="0"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Loop Files" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
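
For readers who want to check which repository locations the loop above would hand to Execute Process, here is a rough Python sketch of the same logic: filter the directory by the regex `.*\.rmp`, remove ".rmp" from each file name, and prefix the repository folder. It is an illustration only, not part of the process; the function name `process_locations` is hypothetical, and it assumes the regex filter has to match the whole file name.

```python
import re
from pathlib import Path


def process_locations(directory, repository_folder, pattern=r".*\.rmp"):
    """Return the repository location for every process file in `directory`.

    Mirrors the three steps of the process above:
    Loop Files (regex filter), Generate Macro (strip ".rmp"),
    Execute Process (location "<repository_folder>/<file_name>").
    """
    locations = []
    # Sorted only to make the result deterministic; the order in which
    # Loop Files visits the files is not assumed here.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and re.fullmatch(pattern, path.name):
            # Same effect as replace(%{file_name}, ".rmp", "")
            name = path.name.replace(".rmp", "")
            locations.append(f"{repository_folder}/{name}")
    return locations
```

With the values from the process above, `process_locations(<the Loop Files directory>, "crawl web")` lists one location of the form `crawl web/<process name>` per `.rmp` file.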

Re: Is it possible to loop over processes in a repository?

To make what @kayman suggested easier, you can use some operators from the Operator Toolbox extension. Have a look at "Create ExampleSet": there you can add the list of processes you want to execute.

There are also operators to set parameters and to set macros from an example set.

 

Elite II

Re: Is it possible to loop over processes in a repository?

Kayman, Edin, Bhupendra:

 

All of your suggestions work perfectly. Thank you very much for your time. 

 

Regards,

 

\E