Multiply operator outputs

pop Member Posts: 21  Maven
edited November 2018 in Help
Hello,
Here is a basic process to look at the Multiply operator outputs.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
    <process expanded="true" height="415" width="681">
      <operator activated="true" class="generate_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
        <parameter key="number_examples" value="10"/>
        <parameter key="number_of_attributes" value="2"/>
        <parameter key="attributes_lower_bound" value="20.0"/>
        <parameter key="attributes_upper_bound" value="30.0"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.0.10" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
      <operator activated="true" class="normalize" compatibility="5.0.10" expanded="true" height="94" name="Normalize" width="90" x="313" y="75">
        <parameter key="method" value="range transformation"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Normalize" to_port="example set input"/>
      <connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
Is it normal to get the same result on both outputs?
I would have expected one to be normalised and the other not.
Thank you very much.

Pop

Answers

  • haddock Member Posts: 849  Guru
    Hi there,

    If you check the help for the Normalize operator, you will see the following option:
    "Create View to apply preprocessing instead of changing the data"
    Your setup does not create a view, and therefore changes the underlying data. If you select this option, you will get something more like what you were expecting. At least I do  ;D
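
    As a rough sketch of that suggestion, the Normalize operator from the original process might look like this with the view option switched on. The parameter key create_view is an assumption inferred from the option's label in the help text, so check the operator's parameter panel in your version before relying on it.

    ```xml
    <!-- Sketch only: Normalize from the original process, with the view option enabled. -->
    <!-- Assumption: the option is stored under the key "create_view". -->
    <operator activated="true" class="normalize" compatibility="5.0.10" expanded="true" height="94" name="Normalize" width="90" x="313" y="75">
      <parameter key="method" value="range transformation"/>
      <parameter key="create_view" value="true"/>
    </operator>
    ```

    With a view, the normalisation should be applied on the fly when the data is read, so the copy on "output 1" of Multiply would keep the original values without the data being duplicated.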

  • pop Member Posts: 21  Maven
    Hi Haddock,
    Thank you very much for your help.
    I think I found the answer in the description of the Multiply operator:
    Note that objects are copied by reference, hence the underlying data of ExampleSets is never copied (unless using a Materialize Data operator). Therefore, copying objects is cheap. When copying ExampleSets only references to attributes are copied. When attributes are changed or added to one example set, this change is invisible to the other copies. However, if data is modified in one thread of the process flow, it is also modified in the other copies.
    So, with the Materialize Data operator added on the first output:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
        <process expanded="true" height="415" width="681">
          <operator activated="true" class="generate_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="2"/>
            <parameter key="attributes_lower_bound" value="20.0"/>
            <parameter key="attributes_upper_bound" value="30.0"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.10" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
          <operator activated="true" class="materialize_data" compatibility="5.0.10" expanded="true" height="76" name="Materialize Data" width="90" x="380" y="30"/>
          <operator activated="true" class="normalize" compatibility="5.0.10" expanded="true" height="94" name="Normalize" width="90" x="380" y="165">
            <parameter key="method" value="range transformation"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Materialize Data" to_port="example set input"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Materialize Data" from_port="example set output" to_port="result 1"/>
          <connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="126"/>
        </process>
      </operator>
    </process>
    It now behaves as I expected.
    Thank you.

    Pop
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,666  RM Founder
    Hi,

    That's all true, but in this case you have doubled the amount of memory used, since you now have to keep two complete data sets in memory. The cool thing about the view approach described by Haddock is that the data is kept only once, which can be especially important for larger data sets. But as I said: if you are happy with it, I will certainly be happy as well  :)

    Cheers,
    Ingo

    RapidMiner Wisdom 2020
    February 11th and 12th 2020 in Boston, MA, USA
