"Loop Operator input and output problem"

Qingqiu Member Posts: 8 Contributor II
edited May 2019 in Help
In the "Loop" operator description it says:
The output of each nested operator is the input for the following one, the output of the last inner operator will be the input for the first child in the next iteration. The output of the last operator in the last iteration will be the output of this operator.

However, when I use it to do iterated splitting, the input is the same in every iteration.

My goal is simply to split a data set into subsets 1 and 2, then split subset 1 into subsets 3 and 4, and so on. Here is the process:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="521" width="820">
      <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75"/>
      <operator activated="true" class="loop" expanded="true" height="76" name="Loop" width="90" x="179" y="120">
        <parameter key="iterations" value="2"/>
        <process expanded="true" height="488" width="877">
          <operator activated="true" class="split_data" expanded="true" height="76" name="Split Data" width="90" x="112" y="120">
            <enumeration key="partitions">
              <parameter key="ratio" value="0.6"/>
              <parameter key="ratio" value="0.4"/>
            </enumeration>
          </operator>
          <connect from_port="input 1" to_op="Split Data" to_port="example set"/>
          <connect from_op="Split Data" from_port="partition 1" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="90"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Loop" to_port="input 1"/>
      <connect from_op="Loop" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
I expected the output of iteration 1 to be the input of iteration 2, so that after 2 iterations the output sample size would be 36 = 0.6 * 0.6 * 100 (100 is the original sample size and 0.6 is the split ratio). Instead it is still 60 = 0.6 * 100. I hope that makes it clear. Thanks for any help :)
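
To make the two outcomes concrete, here is a rough sketch in plain Python rather than RapidMiner. It only models the arithmetic: the function names are hypothetical, and the split simply takes the leading rows instead of reproducing how Split Data actually samples. One function chains each iteration's output into the next (the behaviour the description suggests), the other feeds the original input to every iteration (the behaviour observed in RM 5).

```python
def split_first_partition(rows, ratio=0.6):
    """Return the first partition: the leading `ratio` share of the rows."""
    return rows[: round(len(rows) * ratio)]


def loop_chained(rows, iterations, ratio=0.6):
    # Each iteration receives the previous iteration's output.
    for _ in range(iterations):
        rows = split_first_partition(rows, ratio)
    return rows


def loop_same_input(rows, iterations, ratio=0.6):
    # Each iteration receives the original input again,
    # so only the result of the last iteration survives.
    result = rows
    for _ in range(iterations):
        result = split_first_partition(rows, ratio)
    return result


data = list(range(100))           # 100 examples, as in the post
chained = loop_chained(data, 2)   # 100 -> 60 -> 36
repeated = loop_same_input(data, 2)  # 100 -> 60, then 100 -> 60 again
print(len(chained), len(repeated))   # prints: 36 60
```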

Answers

  • Qingqiu Member Posts: 8 Contributor II
    I tried it on RM 4.6 and it works: the final output sample size is 36 after two iterations. I don't know why it behaves differently in RM 5.
  • haddock Member Posts: 849 Maven
    Hi there!

    I agree that the documentation is not as clear as it might be, but it may actually be true! If you want to do recursion, you may be better off storing, manipulating, and retrieving the dataset explicitly, like this:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.0" expanded="true" name="Process">
        <process expanded="true" height="358" width="614">
          <operator activated="true" class="generate_data" compatibility="5.0.0" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75"/>
          <operator activated="true" class="store" compatibility="5.0.11" expanded="true" height="60" name="Store" width="90" x="179" y="75">
            <parameter key="repository_entry" value="bla"/>
          </operator>
          <operator activated="true" class="loop" compatibility="5.0.0" expanded="true" height="76" name="Loop" width="90" x="313" y="75">
            <parameter key="iterations" value="2"/>
            <process expanded="true" height="488" width="877">
              <operator activated="true" class="retrieve" compatibility="5.0.11" expanded="true" height="60" name="Retrieve (2)" width="90" x="112" y="120">
                <parameter key="repository_entry" value="bla"/>
              </operator>
              <operator activated="true" class="split_data" compatibility="5.0.0" expanded="true" height="76" name="Split Data" width="90" x="246" y="120">
                <enumeration key="partitions">
                  <parameter key="ratio" value="0.6"/>
                  <parameter key="ratio" value="0.4"/>
                </enumeration>
              </operator>
              <operator activated="true" class="store" compatibility="5.0.11" expanded="true" height="60" name="Store (2)" width="90" x="380" y="120">
                <parameter key="repository_entry" value="bla"/>
              </operator>
              <connect from_op="Retrieve (2)" from_port="output" to_op="Split Data" to_port="example set"/>
              <connect from_op="Split Data" from_port="partition 1" to_op="Store (2)" to_port="input"/>
              <portSpacing port="source_input 1" spacing="90"/>
              <portSpacing port="source_input 2" spacing="54"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="retrieve" compatibility="5.0.11" expanded="true" height="60" name="Retrieve" width="90" x="313" y="210">
            <parameter key="repository_entry" value="bla"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Store" to_port="input"/>
          <connect from_op="Store" from_port="through" to_op="Loop" to_port="input 1"/>
          <connect from_op="Retrieve" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Clunky, yes, but it produces 36 examples  :o


  • Qingqiu Member Posts: 8 Contributor II
    Thank you, haddock. It works, although not in the way I expected. I will try other methods. Thanks :)
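
For readers who want to see why the Store/Retrieve workaround yields 36 examples, the idea can be sketched in plain Python. This is only an illustration under assumptions: the dictionary `repository` and the functions `store`, `retrieve`, and `run` are hypothetical stand-ins, not RapidMiner APIs, and the split again just takes the leading rows. The point is that the state is carried between iterations through the repository entry "bla" rather than through the loop's ports.

```python
repository = {}  # hypothetical stand-in for the RapidMiner repository


def store(entry, rows):
    """Save rows under a repository entry and pass them through."""
    repository[entry] = rows
    return rows


def retrieve(entry):
    """Load the rows last saved under a repository entry."""
    return repository[entry]


def run(iterations=2, ratio=0.6):
    store("bla", list(range(100)))  # Generate Data -> Store
    for _ in range(iterations):
        rows = retrieve("bla")                     # Retrieve (2)
        first = rows[: round(len(rows) * ratio)]   # Split Data, partition 1
        store("bla", first)                        # Store (2)
    return retrieve("bla")                         # final Retrieve


result = run()
print(len(result))  # prints: 36
```

Because every iteration overwrites the same entry, the loop ignores whatever arrives at its input port, which is why it does not matter whether the loop re-feeds the original data each time.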