Options

Parallel execution of the

maciekmaciek Member Posts: 13 Contributor II
edited June 2019 in Help
Dear Community,

my question is - how (or is this possible) to achieve parallel execution of the "Execute R" operator inside a "Loop" operator.

I have a Collection of example sets and for every example set I want to perform some calculation using the "Execute R" operator. I enable the parallel execution checkbox expecting that it will run Rscript.exe in parallel. However, it is not.

I attach an example process:

<?xml version="1.0" encoding="UTF-8"?><process version="9.2.001"><br>&nbsp; <context><br>&nbsp;&nbsp;&nbsp; <input/><br>&nbsp;&nbsp;&nbsp; <output/><br>&nbsp;&nbsp;&nbsp; <macros/><br>&nbsp; </context><br>&nbsp; <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process"><br>&nbsp;&nbsp;&nbsp; <parameter key="logverbosity" value="init"/><br>&nbsp;&nbsp;&nbsp; <parameter key="random_seed" value="2001"/><br>&nbsp;&nbsp;&nbsp; <parameter key="send_mail" value="never"/><br>&nbsp;&nbsp;&nbsp; <parameter key="notification_email" value=""/><br>&nbsp;&nbsp;&nbsp; <parameter key="process_duration_for_mail" value="30"/><br>&nbsp;&nbsp;&nbsp; <parameter key="encoding" value="SYSTEM"/><br>&nbsp;&nbsp;&nbsp; <process expanded="true"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <operator activated="true" class="concurrency:loop" compatibility="9.2.001" expanded="true" height="82" name="Generate Data" width="90" x="179" y="187"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="number_of_iterations" value="50"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="iteration_macro" value="iteration"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="reuse_results" value="false"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="enable_parallel_execution" value="true"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <process expanded="true"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <operator activated="true" class="utility:create_exampleset" compatibility="9.2.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="85"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="generator_type" value="numeric series"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="number_of_examples" value="1000"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="use_stepsize" value="true"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <list key="function_descriptions"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="add_id_attribute" value="false"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <list key="numeric_series_configuration"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="Att1" value="linear.0\.0.1\.0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </list><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <list key="date_series_configuration"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <list key="date_series_configuration (interval)"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="time_zone" value="SYSTEM"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="column_separator" value=","/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="parse_all_as_nominal" value="false"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="decimal_point_character" value="."/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="trim_attribute_names" value="true"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </operator><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <connect from_op="Create ExampleSet" from_port="output" to_port="output 1"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="source_input 1" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="sink_output 1" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="sink_output 2" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </process><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </operator><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <operator activated="true" class="concurrency:loop" compatibility="9.2.001" expanded="true" height="82" name="Execute R Script" width="90" x="380" y="187"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="number_of_iterations" value="50"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="iteration_macro" value="iteration"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="reuse_results" value="false"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="enable_parallel_execution" value="true"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <process expanded="true"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <operator activated="true" class="select" compatibility="9.2.001" expanded="true" height="68" name="Select" width="90" x="179" y="85"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="index" value="%{iteration}"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="unfold" value="false"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </operator><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <operator activated="true" class="r_scripting:execute_r" compatibility="9.1.000" expanded="true" height="103" name="Execute R" width="90" x="380" y="85"><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(data)&#10;{&#10;&#9;res &lt;- mean(data$Att1)&#10;&#9;library(qdapTools)&#10;&#9;return (list2df(res))&#10;}&#10;"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </operator><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <connect from_port="input 1" to_op="Select" to_port="collection"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <connect from_op="Select" from_port="selected" to_op="Execute R" to_port="input 1"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <connect from_op="Execute R" from_port="output 1" to_port="output 1"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="source_input 1" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="source_input 2" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="sink_output 1" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="sink_output 2" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </process><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </operator><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <connect from_op="Generate Data" from_port="output 1" to_op="Execute R Script" to_port="input 1"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <connect from_op="Execute R Script" from_port="output 1" to_port="result 1"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="source_input 1" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="sink_result 1" spacing="0"/><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <portSpacing port="sink_result 2" spacing="0"/><br>&nbsp;&nbsp;&nbsp; </process><br>&nbsp; </operator><br></process>

Could you please help me, how to fix that?

Regards
Maciek


Best Answers

  • Options
    jmerglerjmergler Administrator, Moderator, Employee, RapidMiner Certified Analyst, Member, University Professor Posts: 41 Guru
    Solution Accepted
    Hi Maciek,
    I ran a similar process and it worked in parallel. Could you please check Settings - Manage Licenses, to make sure your current license will handle the multiple logical processors?
  • Options
    David_ADavid_A Administrator, Moderator, Employee, RMResearcher, Member Posts: 297 RM Research
    edited April 2019 Solution Accepted
    Hi Maciek,

    @jmergler is correct. From a technical standpoint there is no reason why it shouldn't run in parallel, if your license supports multiple cores.

    Best,
    David

Answers

  • Options
    SGolbertSGolbert RapidMiner Certified Analyst, Member Posts: 344 Unicorn
    edited April 2019

    I just tested it with my Studio Large license and it runs in parallel:



    I modified the script because I didn't want to install the library:
    # rm_main is a mandatory function, <br># the number of arguments has to be the number of input ports (can be none)<br>rm_main = function(data)<br>{<br>	res <- mean(data$Att1)<br>	data$Att1 = data$Att1 - res<br>	return (data)<br>}<br>

    I think it could either be a license or a OS (macos, linux) problem. In case of mac/linux, what Java package are you using?


    Also note that if the computation is very intensive, it may be better to move the parallelization to R. You also have full access to your hardware in R (sorry, I had to say it).

    Regards,
    Sebastian

  • Options
    maciekmaciek Member Posts: 13 Contributor II

    Thank you for your answers, the right answer was more obvious than I thought!

    Indeed, the script works great in parallel, however not if I am running the educational license for academic use.

    Best regards
    Maciek
Sign In or Register to comment.