Do I always have to run the whole process from the start?

Nona Member Posts: 15 Contributor II
Hi All,
I am new to RapidMiner and was wondering whether there is a way to resume execution of a process from a point it already reached in a previous run (the green dot). Some processes are very large and time-consuming, and sometimes we only need to change one thing to experiment with something else (a model, for example). Is there any way to avoid running the process from the start every time? For example, if a process contains some pre-processing operators and we want to experiment with different models, we should not need to rerun all the pre-processing steps each time, since nothing in them has changed. Likewise, if there is an error at some point and we correct it, we currently have to run the process from the start again. How can this be achieved?

Answers

  • DocMusher Member Posts: 333 Unicorn
    Hi,
    Split your process into several smaller processes, store the results of each one, and retrieve them at the start of the next process.
    Cheers
    Sven
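
    As a rough sketch of the split suggested above, the first process could end in a Store operator, and a second process could begin with Retrieve and contain only the part being experimented on. This is an illustration under assumptions, not a tested export: the repository path `//Local Repository/data/prepared` and the operator names are hypothetical, layout attributes are left out, and the pre-processing operators are only marked by a comment.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <!-- Process 1: run the expensive steps once and store the result -->
    <process version="6.4.000">
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" name="Retrieve Ripley-Set">
            <parameter key="repository_entry" value="//Samples/data/Ripley-Set"/>
          </operator>
          <!-- the time-consuming pre-processing operators would sit here -->
          <operator activated="true" class="store" compatibility="6.4.000" expanded="true" name="Store Prepared Data">
            <!-- hypothetical path: use any repository location you can write to -->
            <parameter key="repository_entry" value="//Local Repository/data/prepared"/>
          </operator>
          <connect from_op="Retrieve Ripley-Set" from_port="output" to_op="Store Prepared Data" to_port="input"/>
          <connect from_op="Store Prepared Data" from_port="through" to_port="result 1"/>
        </process>
      </operator>
    </process>
    ```

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <!-- Process 2: start from the stored data and swap models freely -->
    <process version="6.4.000">
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" name="Retrieve Prepared Data">
            <!-- same hypothetical path as in Process 1 -->
            <parameter key="repository_entry" value="//Local Repository/data/prepared"/>
          </operator>
          <operator activated="true" class="logistic_regression" compatibility="6.4.000" expanded="true" name="Logistic Regression"/>
          <connect from_op="Retrieve Prepared Data" from_port="output" to_op="Logistic Regression" to_port="training set"/>
          <connect from_op="Logistic Regression" from_port="model" to_port="result 1"/>
        </process>
      </operator>
    </process>
    ```

    With this layout, only Process 2 needs to be rerun when trying a different learner; Process 1 is rerun only when the pre-processing itself changes.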
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    Here's a sneaky trick if you don't want to use multiple Store operators throughout your process.

    For use with RapidMiner Server reports there are two operators, Publish to App and Recall from App. I'm not going to go into what these operators should actually be used for with Server (trust me, it's really cool); instead, we're going to use them for a completely different purpose.

    The Publish to App operator actually stores the data in memory as part of your currently open session.  So you can use it in conjunction with 'Recall from App' and the view (View -> Show View) "App Objects". 

    Run the process below. There's an obvious error that will cause it to fail, but notice that I have used several Publish to App operators within the process. This lets me go back after the error is flagged and retrieve the data at the point of failure.
    Have a play around with this and let me know if you have any questions.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Ripley-Set" width="90" x="45" y="75">
            <parameter key="repository_entry" value="//Samples/data/Ripley-Set"/>
          </operator>
          <operator activated="true" class="publish_to_app" compatibility="6.4.000" expanded="true" height="60" name="Publish to App" width="90" x="179" y="75">
            <parameter key="name" value="breakpoint"/>
          </operator>
          <operator activated="true" class="logistic_regression" compatibility="6.4.000" expanded="true" height="94" name="Logistic Regression" width="90" x="179" y="165"/>
          <operator activated="true" class="publish_to_app" compatibility="6.4.000" expanded="true" height="60" name="Publish to App (2)" width="90" x="313" y="165">
            <parameter key="name" value="breakpoint"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model" width="90" x="447" y="165">
            <list key="application_parameters"/>
            <description align="center" color="transparent" colored="false" width="126">Oh no! An Error! How to get my model back!</description>
          </operator>
          <connect from_op="Retrieve Ripley-Set" from_port="output" to_op="Publish to App" to_port="store"/>
          <connect from_op="Publish to App" from_port="stored" to_op="Logistic Regression" to_port="training set"/>
          <connect from_op="Logistic Regression" from_port="model" to_op="Publish to App (2)" to_port="store"/>
          <connect from_op="Publish to App (2)" from_port="stored" to_op="Apply Model" to_port="model"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>