How to clone an ExampleSet [SOLVED]

sfmorais Member Posts: 13 Contributor II
edited November 2018 in Help
Hi,

In the middle of my process I have a 'Loop Examples' operator.
In each iteration I need to temporarily change the values of some attributes of the current example, and I want to roll back these changes at the end of the iteration.

The problem is that the changes I make are always reflected in the main ExampleSet (the one the loop runs over).

How can I clone the current example?

Thanks,
Sérgio

Answers

  • [Deleted User] Posts: 0 Learner II
    Hi Sérgio,

    One way to keep your original ExampleSet is to use the Multiply operator before you enter the loop.
    With Generate Copy you can keep copies of single attributes, if that is sufficient for you.

    Do you really need to change the attribute values?
    With Extract Macro you can fetch the value of a given attribute for the current example (addressed by %{example}).
    If you then want to use this value, e.g. to compute a new value, you can refer to the macro as %{macro_name}.

    Hope this helps.

    Cheers,
    Edin
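
Edin's second suggestion, reading a value instead of overwriting it, can be illustrated outside RapidMiner with a small Python sketch. This is only an analogy under stated assumptions: rows are plain dicts, and the names `rows` and `scaled_value` are hypothetical. Nothing here is RapidMiner API.

```python
# Hypothetical stand-in for an example set: a list of rows (dicts).
rows = [
    {"id": 1, "att1": 2.0},
    {"id": 2, "att1": 3.5},
    {"id": 3, "att1": -1.0},
]


def scaled_value(row, attribute, factor):
    """Read one attribute value (roughly what Extract Macro does) and
    derive a new value from it without writing anything back to the row."""
    value = row[attribute]   # read only, like fetching the value into a macro
    return value * factor    # use the fetched value in a computation


# Loop over the examples; the original rows are never modified.
derived = [scaled_value(row, "att1", 10) for row in rows]
```

Because nothing is written back, there is nothing to roll back after each iteration.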
  • sfmorais Member Posts: 13 Contributor II
    Hi Edin,

    I worked around my problem another way. Generate Copy is not enough in my case, because my example set can have hundreds of attributes.
    I had thought that RapidMiner would have a handy 'clone example' operator.

    Thanks

    Cheers,
    Edin
  • haddock Member Posts: 849 Maven
    Hi Sergio,

    I've probably got the wrong end of the stick, but I think the trick is to pass the original set round and round, like this.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.001" expanded="true" name="Process">
        <process expanded="true" height="409" width="661">
          <operator activated="true" class="generate_data" compatibility="5.2.001" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
          <operator activated="true" class="loop_examples" compatibility="5.2.001" expanded="true" height="76" name="Loop Examples" width="90" x="246" y="30">
            <process expanded="true" height="409" width="681">
              <operator activated="true" class="extract_log_value" compatibility="5.2.001" expanded="true" height="60" name="Extract Log Value" width="90" x="45" y="30">
                <parameter key="attribute_name" value="att1"/>
                <parameter key="example_index" value="%{example}"/>
              </operator>
              <operator activated="true" class="set_data" compatibility="5.2.001" expanded="true" height="76" name="Set Data" width="90" x="179" y="30">
                <parameter key="example_index" value="%{example}"/>
                <parameter key="attribute_name" value="att1"/>
                <parameter key="value" value="42"/>
                <list key="additional_values"/>
              </operator>
              <operator activated="true" class="provide_macro_as_log_value" compatibility="5.2.001" expanded="true" height="76" name="Provide Macro as Log Value" width="90" x="313" y="120">
                <parameter key="macro_name" value="example"/>
              </operator>
              <operator activated="true" class="extract_log_value" compatibility="5.2.001" expanded="true" height="60" name="Extract Log Value (2)" width="90" x="447" y="120">
                <parameter key="attribute_name" value="att1"/>
                <parameter key="example_index" value="%{example}"/>
              </operator>
              <operator activated="true" class="log" compatibility="5.2.001" expanded="true" height="76" name="Log" width="90" x="514" y="120">
                <list key="log">
                  <parameter key="Example" value="operator.Loop Examples.value.iteration"/>
                  <parameter key="Before" value="operator.Extract Log Value.value.data_value"/>
                  <parameter key="After" value="operator.Extract Log Value (2).value.data_value"/>
                </list>
              </operator>
              <connect from_port="example set" to_op="Extract Log Value" to_port="example set"/>
              <connect from_op="Extract Log Value" from_port="example set" to_op="Set Data" to_port="example set input"/>
              <connect from_op="Set Data" from_port="example set output" to_op="Provide Macro as Log Value" to_port="through 1"/>
              <connect from_op="Set Data" from_port="original" to_port="example set"/>
              <connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Extract Log Value (2)" to_port="example set"/>
              <connect from_op="Extract Log Value (2)" from_port="example set" to_op="Log" to_port="through 1"/>
              <connect from_op="Log" from_port="through 1" to_port="output 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    If you look at the log you'll see that 'att1' in each example got changed, and the change got rolled back, sort of, anyway!

    Good weekend

  • sfmorais Member Posts: 13 Contributor II
    Hi haddock

    Thanks for your reply.

    To explain my problem better, I have put together this document:
    http://www.lavradeirasarcozelo.com/extra/description.pdf


    And here is the code for the example in the document:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
        <process expanded="true" height="463" width="567">
          <operator activated="true" class="generate_data" compatibility="5.2.000" expanded="true" height="60" name="Generate Data" width="90" x="29" y="43">
            <parameter key="number_examples" value="5"/>
            <parameter key="datamanagement" value="int_array"/>
          </operator>
          <operator activated="true" class="generate_id" compatibility="5.2.000" expanded="true" height="76" name="Generate ID" width="90" x="179" y="30"/>
          <operator activated="true" class="select_attributes" compatibility="5.2.000" expanded="true" height="76" name="Select Attributes" width="90" x="313" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="label"/>
            <parameter key="invert_selection" value="true"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.2.000" expanded="true" height="94" name="Loop Examples" width="90" x="447" y="30">
            <parameter key="iteration_macro" value="example_index"/>
            <process expanded="true" height="482" width="480">
              <operator activated="true" class="extract_macro" compatibility="5.2.000" expanded="true" height="60" name="Extract Macro" width="90" x="45" y="30">
                <parameter key="macro" value="id_value"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="id"/>
                <parameter key="example_index" value="%{example_index}"/>
              </operator>
              <operator activated="true" class="set_data" compatibility="5.2.000" expanded="true" height="76" name="Set Data" width="90" x="179" y="30">
                <parameter key="example_index" value="%{example_index}"/>
                <parameter key="attribute_name" value="att2"/>
                <parameter key="value" value="100"/>
                <list key="additional_values">
                  <parameter key="att3" value="100"/>
                  <parameter key="att4" value="100"/>
                </list>
              </operator>
              <operator activated="true" class="aggregate" compatibility="5.2.000" expanded="true" height="76" name="Aggregate" width="90" x="313" y="30">
                <parameter key="use_default_aggregation" value="true"/>
                <parameter key="default_aggregation_function" value="sum"/>
                <list key="aggregation_attributes"/>
              </operator>
              <connect from_port="example set" to_op="Extract Macro" to_port="example set"/>
              <connect from_op="Extract Macro" from_port="example set" to_op="Set Data" to_port="example set input"/>
              <connect from_op="Set Data" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
              <connect from_op="Aggregate" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

    Many thanks


    Best Regards,
    Sérgio
  • haddock Member Posts: 849 Maven
    Hi Sergio,

    I've read your PDF (it always helps to have clear examples like that), and I think we are nearly there!

    For each Example Row
      Clone the original Example table
      Change the Row in the Clone
      Aggregate the Clone
      Store the Clone and Aggregate
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.2.003" expanded="true" name="Process">
       <process expanded="true" height="409" width="661">
         <operator activated="true" class="generate_data" compatibility="5.2.003" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
           <parameter key="number_examples" value="10"/>
         </operator>
         <operator activated="true" class="loop_examples" compatibility="5.2.003" expanded="true" height="94" name="Loop Examples" width="90" x="246" y="30">
           <process expanded="true" height="409" width="820">
             <operator activated="true" class="set_data" compatibility="5.2.003" expanded="true" height="76" name="Set Data" width="90" x="179" y="30">
               <parameter key="example_index" value="%{example}"/>
               <parameter key="attribute_name" value="att1"/>
               <parameter key="value" value="42"/>
               <list key="additional_values"/>
             </operator>
             <operator activated="true" class="aggregate" compatibility="5.2.003" expanded="true" height="76" name="Aggregate" width="90" x="313" y="165">
               <parameter key="use_default_aggregation" value="true"/>
               <list key="aggregation_attributes"/>
             </operator>
             <operator activated="true" class="collect" compatibility="5.2.003" expanded="true" height="94" name="Collect" width="90" x="514" y="165">
               <parameter key="unfold" value="true"/>
             </operator>
             <connect from_port="example set" to_op="Set Data" to_port="example set input"/>
             <connect from_op="Set Data" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
             <connect from_op="Set Data" from_port="original" to_port="example set"/>
             <connect from_op="Aggregate" from_port="example set output" to_op="Collect" to_port="input 1"/>
             <connect from_op="Aggregate" from_port="original" to_op="Collect" to_port="input 2"/>
             <connect from_op="Collect" from_port="collection" to_port="output 1"/>
             <portSpacing port="source_example set" spacing="0"/>
             <portSpacing port="sink_example set" spacing="54"/>
             <portSpacing port="sink_output 1" spacing="36"/>
             <portSpacing port="sink_output 2" spacing="0"/>
           </process>
         </operator>
         <connect from_op="Generate Data" from_port="output" to_op="Loop Examples" to_port="example set"/>
         <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
         <connect from_op="Loop Examples" from_port="output 1" to_port="result 2"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
         <portSpacing port="sink_result 3" spacing="0"/>
       </process>
     </operator>
    </process>
    So the real answer to your question is that you can also clone an example set with the 'Set Data' operator. The trick, as hinted above, is to pass the original set round and round: one set goes in, two come out.

    Hope that's closer to it!

    PS: There are loads of other ways to copy data tables in RM; I draw this one to your attention only because it is less obvious!
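
The per-row steps haddock lists (clone the table, change one row in the clone, aggregate the clone, store both) can also be sketched outside RapidMiner. The Python below is a minimal illustration under stated assumptions, not RapidMiner API: the table is a list of dicts, the helper names `aggregate_sum` and `loop_with_clones` are hypothetical, and a sum aggregation is assumed, as in Sérgio's process.

```python
import copy

# Hypothetical stand-in for the example table: a list of rows (dicts).
table = [
    {"att1": 1.0, "att2": 2.0},
    {"att1": 3.0, "att2": 4.0},
    {"att1": 5.0, "att2": 6.0},
]


def aggregate_sum(rows):
    """Sum every attribute over all rows, giving one aggregate row."""
    totals = {}
    for row in rows:
        for name, value in row.items():
            totals[name] = totals.get(name, 0.0) + value
    return totals


def loop_with_clones(original, attribute, new_value):
    """For each row: clone the table, change that row in the clone,
    aggregate the clone, and store both. The original stays untouched."""
    results = []
    for index in range(len(original)):
        clone = copy.deepcopy(original)       # independent copy of all rows
        clone[index][attribute] = new_value   # change only the clone
        results.append((clone, aggregate_sum(clone)))
    return results


results = loop_with_clones(table, "att1", 42.0)
```

Each iteration starts from the unmodified original, so no explicit rollback step is needed. The cost is one full copy of the table per row, which may matter for large example sets.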
  • sfmorais Member Posts: 13 Contributor II

    Hi haddock

    OK! I finally understand, thanks to your last code.

    Many thanks for your help!


    Best Regards,
    Sérgio