[solved] Running different data sets through the same operators, and two other questions

blueearth Member Posts: 42 Contributor II
edited November 2018 in Help
Hi, I have 11 different data sets and I want to run them through the same operators in one process,
but I want to know how to do that without duplicating the operators 11 times.
I mean, is there a way to connect all the data sets to the operators so that they are read one after another, collect all the results with Log and Log to Data, and write a final Excel file from them?

Second question: if there is no way to do what I described, how can I copy operators within the same process without changing their arrangement? When I copy operators, they don't keep their original arrangement.
Thanks

Third question: if I have two processes, is there a way for the second process to start automatically when the first process has finished?

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Very good questions. I split my answer into three posts for better readability.

    First Answer: You can Collect your data and then use Loop Collection. Please see the attached process.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
       <process expanded="true" height="501" width="770">
         <operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75">
           <parameter key="target_function" value="multi classification"/>
         </operator>
         <operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data (2)" width="90" x="45" y="165">
           <parameter key="target_function" value="multi classification"/>
         </operator>
         <operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data (3)" width="90" x="45" y="255">
           <parameter key="target_function" value="multi classification"/>
         </operator>
         <operator activated="true" class="generate_data" compatibility="5.2.006" expanded="true" height="60" name="Generate Data (4)" width="90" x="45" y="345">
           <parameter key="target_function" value="multi classification"/>
         </operator>
         <operator activated="true" class="collect" compatibility="5.2.006" expanded="true" height="130" name="Collect" width="90" x="179" y="75"/>
         <operator activated="true" class="loop_collection" compatibility="5.2.006" expanded="true" height="76" name="Loop Collection" width="90" x="313" y="75">
           <process expanded="true" height="501" width="614">
             <operator activated="true" class="naive_bayes" compatibility="5.2.006" expanded="true" height="76" name="Naive Bayes" width="90" x="45" y="30"/>
             <operator activated="true" class="apply_model" compatibility="5.2.006" expanded="true" height="76" name="Apply Model" width="90" x="180" y="30">
               <list key="application_parameters"/>
             </operator>
             <operator activated="true" class="performance" compatibility="5.2.006" expanded="true" height="76" name="Performance" width="90" x="313" y="30"/>
             <operator activated="true" class="log" compatibility="5.2.006" expanded="true" height="76" name="Log" width="90" x="447" y="30">
               <list key="log">
                 <parameter key="performance" value="operator.Performance.value.performance"/>
               </list>
             </operator>
             <connect from_port="single" to_op="Naive Bayes" to_port="training set"/>
             <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
             <connect from_op="Naive Bayes" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
             <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
             <connect from_op="Performance" from_port="performance" to_op="Log" to_port="through 1"/>
             <connect from_op="Log" from_port="through 1" to_port="output 1"/>
             <portSpacing port="source_single" spacing="0"/>
             <portSpacing port="sink_output 1" spacing="0"/>
             <portSpacing port="sink_output 2" spacing="0"/>
           </process>
         </operator>
         <operator activated="true" breakpoints="after" class="log_to_data" compatibility="5.2.006" expanded="true" height="94" name="Log to Data" width="90" x="447" y="75">
           <parameter key="log_name" value="Log"/>
         </operator>
         <operator activated="true" class="write_excel" compatibility="5.2.006" expanded="true" height="76" name="Write Excel" width="90" x="581" y="75">
           <parameter key="excel_file" value="tmp.xls"/>
         </operator>
         <connect from_op="Generate Data" from_port="output" to_op="Collect" to_port="input 1"/>
         <connect from_op="Generate Data (2)" from_port="output" to_op="Collect" to_port="input 2"/>
         <connect from_op="Generate Data (3)" from_port="output" to_op="Collect" to_port="input 3"/>
         <connect from_op="Generate Data (4)" from_port="output" to_op="Collect" to_port="input 4"/>
         <connect from_op="Collect" from_port="collection" to_op="Loop Collection" to_port="collection"/>
         <connect from_op="Loop Collection" from_port="output 1" to_op="Log to Data" to_port="through 1"/>
         <connect from_op="Log to Data" from_port="exampleSet" to_op="Write Excel" to_port="input"/>
         <connect from_op="Write Excel" from_port="through" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Second answer: There is already a bug report for the copy-paste issue: http://bugs.rapid-i.com/show_bug.cgi?id=1018
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Third question: if I have two processes, is there a way for the second process to start automatically when the first process has finished?

    Yes, you can use the Execute Process operator for this. Just put it at the end of your first process to call the second process.
    You can even pass data and macros from one process to another.

    Best, Marius
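
The Execute Process suggestion in the third answer can be sketched as a fragment of process XML in the same style as the first answer. This is only a minimal sketch and is not taken from the thread: the repository path `//Local Repository/processes/second_process` and the macro name `run_label` are hypothetical placeholders, and the coordinates assume the operator is placed after Write Excel in the example process. The operator class `execute_process`, its `process_location` and `macros` parameters, and the port names `input 1` and `result 1` are assumed to match the RapidMiner 5.2 version used above, so check them against your installation.

```xml
<!-- Goes inside the top-level <process expanded="true" ...> element of the first process. -->
<!-- Hypothetical repository path: point process_location at your own second process. -->
<operator activated="true" class="execute_process" compatibility="5.2.006" expanded="true" height="76" name="Execute Process" width="90" x="715" y="75">
  <parameter key="process_location" value="//Local Repository/processes/second_process"/>
  <list key="macros">
    <!-- Hypothetical macro passed to the second process; readable there as %{run_label}. -->
    <parameter key="run_label" value="after_first_process"/>
  </list>
</operator>
<!-- These two connections take the place of the original Write Excel to "result 1" connection, -->
<!-- so the second process starts only after the Excel file has been written. -->
<connect from_op="Write Excel" from_port="through" to_op="Execute Process" to_port="input 1"/>
<connect from_op="Execute Process" from_port="result 1" to_port="result 1"/>
```

Because Execute Process sits at the end of the chain, the second process runs once everything before it has finished. Data arriving at `input 1` is handed to the called process, and the macro list is how values such as `run_label` are passed along.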