Multiply operator outputs

pop Member Posts: 21 Maven
edited November 2018 in Help
Hello,
Here is a basic process to look at the Multiply operator outputs.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
    <process expanded="true" height="415" width="681">
      <operator activated="true" class="generate_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
        <parameter key="number_examples" value="10"/>
        <parameter key="number_of_attributes" value="2"/>
        <parameter key="attributes_lower_bound" value="20.0"/>
        <parameter key="attributes_upper_bound" value="30.0"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="5.0.10" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
      <operator activated="true" class="normalize" compatibility="5.0.10" expanded="true" height="94" name="Normalize" width="90" x="313" y="75">
        <parameter key="method" value="range transformation"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Normalize" to_port="example set input"/>
      <connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
Is it normal to get the same result on both outputs?
I would have expected one to be normalised and not the other.
Thank you very much.

Pop

Answers

  • haddock Member Posts: 849 Maven
    Hi there,

    If you check the help for the Normalize operator, you will see the following option:
    Create View to apply preprocessing instead of changing the data
    Your setup does not create a view and therefore changes the data. If you select this option, you will get something more like what you were expecting. At least I do  ;D
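
    As a rough sketch of what that looks like in the process XML, here is the original process with the option switched on for Normalize. The parameter key `create_view` is an assumption (it does not appear anywhere in this thread), so check the name in the operator's parameter panel for your version before relying on it:

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <!-- Sketch only: same process as in the question, with the view option enabled on Normalize. -->
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
        <process expanded="true" height="415" width="681">
          <operator activated="true" class="generate_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="2"/>
            <parameter key="attributes_lower_bound" value="20.0"/>
            <parameter key="attributes_upper_bound" value="30.0"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.10" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
          <operator activated="true" class="normalize" compatibility="5.0.10" expanded="true" height="94" name="Normalize" width="90" x="313" y="75">
            <!-- Assumed key name for the "create view" option: apply the normalization as a view instead of changing the shared data. -->
            <parameter key="create_view" value="true"/>
            <parameter key="method" value="range transformation"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    With the view in place, result 1 should show the generated values unchanged and result 2 the normalised ones, without a second copy of the data being made.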

  • pop Member Posts: 21 Maven
    Hi Haddock,
    Thank you very much for your help.
    I think I found the answer in the description of the Multiply operator:
    Note that objects are copied by reference, hence the underlying data of ExampleSets is never copied (unless using a Materialize Data operator). Therefore, copying objects is cheap. When copying ExampleSets only references to attributes are copied. When attributes are changed or added to one example set, this change is invisible to the other copies. However, if data is modified in one thread of the process flow, it is also modified in the other copies.
    So, with the Materialize Data operator:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
        <process expanded="true" height="415" width="681">
          <operator activated="true" class="generate_data" compatibility="5.0.10" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="2"/>
            <parameter key="attributes_lower_bound" value="20.0"/>
            <parameter key="attributes_upper_bound" value="30.0"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="5.0.10" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
          <operator activated="true" class="materialize_data" compatibility="5.0.10" expanded="true" height="76" name="Materialize Data" width="90" x="380" y="30"/>
          <operator activated="true" class="normalize" compatibility="5.0.10" expanded="true" height="94" name="Normalize" width="90" x="380" y="165">
            <parameter key="method" value="range transformation"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Materialize Data" to_port="example set input"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Materialize Data" from_port="example set output" to_port="result 1"/>
          <connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="126"/>
        </process>
      </operator>
    </process>
    It now behaves as I expected.
    Thank you.

    Pop
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    that's all true, but in this case you have doubled the amount of memory used, since you now have to keep two complete data sets in memory. The cool thing about the view approach described by Haddock is that the data is kept only once, which can be especially important for larger data sets. But as I said: if you are happy with it, I will certainly be happy as well  :)

    Cheers,
    Ingo
