Concatenate Examples

MuehliMan Member Posts: 85 Maven
edited November 2018 in Help
Dear all,

I think there should be an easy solution for this, but I just can't find it:

I have an example set with a few examples, let's say:

atts48
atts67
atts90

I would like to generate a "value" (data or macro) that contains all example values, which would be

atts48, atts67, atts90

for the example mentioned above.

Thanks a lot for your help!

Best,
Markus

Answers

  • dragoljub Member Posts: 241 Contributor II
    From what I understand, you have 3 different examples (rows of attributes), and you want to append these three rows into one row of attributes (adding features from different example sets?). If this is the case, I would assign the same ID value to each row you want to append and then simply use the Join operator to add the additional attributes.

    -Gagi
  • MuehliMan Member Posts: 85 Maven
    Hi Gagi,

    Your idea sounds interesting! If the same ID is given multiple times, does it concatenate the values? I never thought about this. But another question is how to handle multiple values. A Join operator always joins two example sets, so how do I concatenate 3, 4 or more examples?

    Best,
    Markus
  • colo Member Posts: 236 Maven
    Hi Markus,

    I'm not really sure if I fully got the requirements. Gagi's suggestion seems to aim at multiple ExampleSets which should be joined somehow. But you want to concatenate the values of all examples (rows) in a single ExampleSet (table)? I'm not sure if there is an easier way using some special operators, but this should still be a simple solution for the task:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <process expanded="true" height="190" width="547">
          <operator activated="true" class="generate_nominal_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="1"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.0.8" expanded="true" height="76" name="Loop Examples" width="90" x="179" y="30">
            <process expanded="true" height="613" width="786">
              <operator activated="true" class="extract_macro" compatibility="5.0.8" expanded="true" height="60" name="Extract Macro" width="90" x="45" y="30">
                <parameter key="macro" value="example_value"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="att1"/>
                <parameter key="example_index" value="%{example}"/>
              </operator>
              <operator activated="true" class="branch" compatibility="5.0.8" expanded="true" height="76" name="Branch" width="90" x="179" y="30">
                <parameter key="condition_type" value="macro_defined"/>
                <parameter key="condition_value" value="examples"/>
                <process expanded="true" height="613" width="368">
                  <operator activated="true" class="generate_macro" compatibility="5.0.8" expanded="true" height="76" name="Generate Macro" width="90" x="45" y="30">
                    <list key="function_descriptions">
                      <parameter key="examples" value="&quot;%{examples}, %{example_value}&quot;"/>
                    </list>
                  </operator>
                  <connect from_port="condition" to_op="Generate Macro" to_port="through 1"/>
                  <connect from_op="Generate Macro" from_port="through 1" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
                <process expanded="true" height="613" width="368">
                  <operator activated="true" class="generate_macro" compatibility="5.0.8" expanded="true" height="76" name="Generate Macro (2)" width="90" x="45" y="30">
                    <list key="function_descriptions">
                      <parameter key="examples" value="&quot;%{example_value}&quot;"/>
                    </list>
                  </operator>
                  <connect from_port="condition" to_op="Generate Macro (2)" to_port="through 1"/>
                  <connect from_op="Generate Macro (2)" from_port="through 1" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
              </operator>
              <connect from_port="example set" to_op="Extract Macro" to_port="example set"/>
              <connect from_op="Extract Macro" from_port="example set" to_op="Branch" to_port="condition"/>
              <connect from_op="Branch" from_port="input 1" to_port="example set"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="print_to_console" compatibility="5.0.8" expanded="true" height="76" name="Print to Console" width="90" x="313" y="30">
            <parameter key="log_value" value="%{examples}"/>
          </operator>
          <connect from_op="Generate Nominal Data" from_port="output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="example set" to_op="Print to Console" to_port="through 1"/>
          <connect from_op="Print to Console" from_port="through 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Regards,
    Matthias
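
For readers who do not want to trace the XML, here is a minimal Python sketch of the logic the process above appears to implement. It is not RapidMiner code. The function name `concatenate_values` and the `separator` parameter are made up for this illustration. The `examples` variable plays the role of the macro of the same name, which is undefined on the first iteration (the Else branch) and extended on every later one (the Then branch).

```python
def concatenate_values(values, separator=", "):
    """Mimic the Loop Examples + Branch logic of the process above.

    The first value initialises the accumulator (Else branch, macro not yet
    defined); every later value is appended after the separator (Then branch).
    """
    examples = None  # stands in for the not-yet-defined macro "examples"
    for example_value in values:
        if examples is None:
            examples = str(example_value)
        else:
            examples = f"{examples}{separator}{example_value}"
    # An empty input never defines the macro; return an empty string here.
    return "" if examples is None else examples


print(concatenate_values(["atts48", "atts67", "atts90"]))
# prints: atts48, atts67, atts90
```

If the result is meant to be used as a regular expression, calling the sketch with `separator="|"` would give an alternation such as `atts48|atts67|atts90`.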
  • dragoljub Member Posts: 241 Contributor II
    I would do it like this:

    Set the same ID for the examples you want to join into one row.

    Filter on each unique ID and loop through the examples.

    Apply the join once, save the result, and then join again in a loop.

    Finally, save the joined data set.

    The thing that seems strange to me is that you have a data table with different examples, but you somehow want to combine different examples into one long example?
  • MuehliMan Member Posts: 85 Maven
    Hi,

    To make it a bit more obvious why I want to do this:
    I transpose my dataset and select the first attribute only (the attribute names). Now I want to concatenate them, to log them and eventually use them as a regular expression.

    I already thought about your looping workaround. I'll try it.

    Best,
    Markus

  • Richy Member Posts: 20 Contributor II
    Hi,

    I have almost the same problem, but I can't find the solution. I'm very new to RapidMiner and I'm not comfortable with macros. I'm trying to concatenate the values of attribute 2 for all examples that share the same value of attribute 1. The result should be assigned to attribute 3. Here is a small example:

    att1     att2
    val1     c
    val2     a
    val3     a
    val2     b
    val3     c
    val1     b
    val3     a
    val2     a

    so my result should be

    att1     att2    att3
    val1     c        cb
    val2     a        aba
    val3     a        aca
    val2     b        aba
    val3     c        aca
    val1     b        cb
    val3     a        aca
    val2     a        aba

    I tried to adapt colo's code, but I only get:

    att1     att2    att3
    val1     c        c
    val2     a        ca
    val3     a        caa
    val2     b        caab
    val3     c        caabc
    val1     b        caabcb
    val3     a        caabcba
    val2     a        caabcbaa

    Do you know if there is a simple way to do this kind of thing?
  • colo Member Posts: 236 Maven
    Welcome Richy!

    My code example aimed at concatenating all example values into one example.

    This should do the job for you. It looks a bit dirty, but it was the first solution that came to my mind ;)
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <process expanded="true" height="251" width="547">
          <operator activated="true" class="generate_nominal_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="2"/>
          </operator>
          <operator activated="true" class="generate_empty_attribute" compatibility="5.0.11" expanded="true" height="76" name="Generate Empty Attribute" width="90" x="179" y="30">
            <parameter key="name" value="att3"/>
            <parameter key="value_type" value="polynominal"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.0.8" expanded="true" height="76" name="Loop Examples" width="90" x="380" y="30">
            <process expanded="true" height="589" width="681">
              <operator activated="true" class="extract_macro" compatibility="5.0.11" expanded="true" height="60" name="Extract Macro (2)" width="90" x="45" y="30">
                <parameter key="macro" value="att1_value"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="att1"/>
                <parameter key="example_index" value="%{example}"/>
              </operator>
              <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro" width="90" x="179" y="30">
                <list key="function_descriptions">
                  <parameter key="att3_value" value="&quot;&quot;"/>
                </list>
              </operator>
              <operator activated="true" class="filter_examples" compatibility="5.0.11" expanded="true" height="76" name="Filter Examples" width="90" x="179" y="210">
                <parameter key="condition_class" value="attribute_value_filter"/>
                <parameter key="parameter_string" value="att1 = %{att1_value}"/>
              </operator>
              <operator activated="true" class="loop_examples" compatibility="5.0.11" expanded="true" height="76" name="Loop Examples (2)" width="90" x="313" y="210">
                <parameter key="iteration_macro" value="example_inner"/>
                <process expanded="true" height="607" width="786">
                  <operator activated="true" class="extract_macro" compatibility="5.0.11" expanded="true" height="60" name="Extract Macro (3)" width="90" x="45" y="30">
                    <parameter key="macro" value="att2_value"/>
                    <parameter key="macro_type" value="data_value"/>
                    <parameter key="attribute_name" value="att2"/>
                    <parameter key="example_index" value="%{example_inner}"/>
                  </operator>
                  <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro (2)" width="90" x="179" y="30">
                    <list key="function_descriptions">
                      <parameter key="att3_value" value="&quot;%{att3_value}%{att2_value}&quot;"/>
                    </list>
                  </operator>
                  <connect from_port="example set" to_op="Extract Macro (3)" to_port="example set"/>
                  <connect from_op="Extract Macro (3)" from_port="example set" to_op="Generate Macro (2)" to_port="through 1"/>
                  <connect from_op="Generate Macro (2)" from_port="through 1" to_port="example set"/>
                  <portSpacing port="source_example set" spacing="0"/>
                  <portSpacing port="sink_example set" spacing="0"/>
                  <portSpacing port="sink_output 1" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="set_data" compatibility="5.0.11" expanded="true" height="76" name="Set Data" width="90" x="313" y="345">
                <parameter key="attribute_name" value="att3"/>
                <parameter key="example_index" value="%{example}"/>
                <parameter key="value" value="%{att3_value}"/>
              </operator>
              <connect from_port="example set" to_op="Extract Macro (2)" to_port="example set"/>
              <connect from_op="Extract Macro (2)" from_port="example set" to_op="Generate Macro" to_port="through 1"/>
              <connect from_op="Generate Macro" from_port="through 1" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Loop Examples (2)" to_port="example set"/>
              <connect from_op="Filter Examples" from_port="original" to_op="Set Data" to_port="example set input"/>
              <connect from_op="Set Data" from_port="example set output" to_port="example set"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Nominal Data" from_port="output" to_op="Generate Empty Attribute" to_port="example set input"/>
          <connect from_op="Generate Empty Attribute" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Regards,
    Matthias
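
As a rough illustration only, the nested-loop idea of the process above can be sketched in Python. This is not RapidMiner code. The function name `add_group_concat_naive` and the list-of-dicts representation are assumptions made for the sketch. For every example it resets `att3_value`, filters on the current `att1` value, and concatenates `att2` over the matching examples.

```python
def add_group_concat_naive(rows):
    """For each row, concatenate att2 over all rows sharing its att1 value.

    Mirrors the outer Loop Examples (one pass per row), the macro reset,
    Filter Examples plus the inner loop, and finally Set Data.
    """
    result = []
    for row in rows:                      # outer Loop Examples
        att3_value = ""                   # Generate Macro: reset accumulator
        for other in rows:                # Filter Examples + inner loop
            if other["att1"] == row["att1"]:
                att3_value += other["att2"]
        result.append({**row, "att3": att3_value})  # Set Data
    return result


# Richy's sample data from the post above
data = [
    {"att1": "val1", "att2": "c"}, {"att1": "val2", "att2": "a"},
    {"att1": "val3", "att2": "a"}, {"att1": "val2", "att2": "b"},
    {"att1": "val3", "att2": "c"}, {"att1": "val1", "att2": "b"},
    {"att1": "val3", "att2": "a"}, {"att1": "val2", "att2": "a"},
]
for row in add_group_concat_naive(data):
    print(row["att1"], row["att2"], row["att3"])
```

The sketch scans the whole table once per row, so its cost grows quadratically with the number of examples.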
  • Richy Member Posts: 20 Contributor II
    Well, thank you very much Matthias!

    It works very well on this example, but I think the performance isn't very good on large datasets. For every example, it filters and generates macros again, even if the same att1 value has already been processed.

    So I tried to sort my dataset by attribute 1, and then I added a macro "test" (initialized to "") to test whether attribute 1 is the same as in the previous iteration. If yes, I just write att3_value to attribute 3 of the current example. If no, I change the value of "test" and use your process.

    But I'm getting an error when I compare my macro test to the current value of attribute 1, and I don't understand why:
    Message: "" == value0: Unrecognized symbol "value0"

    Maybe my code will be easier to understand:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <process expanded="true" height="251" width="681">
          <operator activated="true" class="generate_nominal_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="2"/>
          </operator>
          <operator activated="true" class="generate_empty_attribute" compatibility="5.0.11" expanded="true" height="76" name="Generate Empty Attribute" width="90" x="179" y="30">
            <parameter key="name" value="att3"/>
            <parameter key="value_type" value="polynominal"/>
          </operator>
          <operator activated="true" class="set_macro" compatibility="5.0.11" expanded="true" height="76" name="Set Macro" width="90" x="313" y="30">
            <parameter key="macro" value="test"/>
            <parameter key="value" value="&quot;&quot;"/>
          </operator>
          <operator activated="true" class="sort" compatibility="5.0.11" expanded="true" height="76" name="Sort" width="90" x="447" y="30">
            <parameter key="attribute_name" value="att1"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.0.8" expanded="true" height="76" name="Loop Examples" width="90" x="581" y="30">
            <process expanded="true" height="589" width="685">
              <operator activated="true" class="extract_macro" compatibility="5.0.11" expanded="true" height="60" name="Extract Macro (2)" width="90" x="112" y="30">
                <parameter key="macro" value="att1_value"/>
                <parameter key="macro_type" value="data_value"/>
                <parameter key="attribute_name" value="att1"/>
                <parameter key="example_index" value="%{example}"/>
              </operator>
              <operator activated="true" class="branch" compatibility="5.0.11" expanded="true" height="76" name="Branch" width="90" x="246" y="30">
                <parameter key="condition_type" value="expression"/>
                <parameter key="condition_value" value="%{test} == %{att1_value}"/>
                <process expanded="true" height="505" width="312">
                  <operator activated="true" class="set_data" compatibility="5.0.11" expanded="true" height="76" name="Set Data (2)" width="90" x="112" y="30">
                    <parameter key="attribute_name" value="att3"/>
                    <parameter key="example_index" value="%{example}"/>
                    <parameter key="value" value="%{att3_value}"/>
                  </operator>
                  <connect from_port="condition" to_op="Set Data (2)" to_port="example set input"/>
                  <connect from_op="Set Data (2)" from_port="example set output" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
                <process expanded="true" height="505" width="361">
                  <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro" width="90" x="45" y="30">
                    <list key="function_descriptions">
                      <parameter key="att3_value" value="&quot;&quot;"/>
                    </list>
                  </operator>
                  <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro (3)" width="90" x="179" y="30">
                    <list key="function_descriptions">
                      <parameter key="test" value="&quot;%{att1_value}&quot;"/>
                    </list>
                  </operator>
                  <operator activated="true" class="filter_examples" compatibility="5.0.11" expanded="true" height="76" name="Filter Examples" width="90" x="45" y="120">
                    <parameter key="condition_class" value="attribute_value_filter"/>
                    <parameter key="parameter_string" value="att1 = %{att1_value}"/>
                  </operator>
                  <operator activated="true" class="set_data" compatibility="5.0.11" expanded="true" height="76" name="Set Data" width="90" x="246" y="210">
                    <parameter key="attribute_name" value="att3"/>
                    <parameter key="example_index" value="%{example}"/>
                    <parameter key="value" value="%{att3_value}"/>
                  </operator>
                  <operator activated="true" class="loop_examples" compatibility="5.0.11" expanded="true" height="76" name="Loop Examples (2)" width="90" x="246" y="120">
                    <parameter key="iteration_macro" value="example_inner"/>
                    <process expanded="true" height="505" width="791">
                      <operator activated="true" class="extract_macro" compatibility="5.0.11" expanded="true" height="60" name="Extract Macro (3)" width="90" x="45" y="30">
                        <parameter key="macro" value="att2_value"/>
                        <parameter key="macro_type" value="data_value"/>
                        <parameter key="attribute_name" value="att2"/>
                        <parameter key="example_index" value="%{example_inner}"/>
                      </operator>
                      <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro (2)" width="90" x="418" y="30">
                        <list key="function_descriptions">
                          <parameter key="att3_value" value="&quot;%{att3_value} %{att2_value}&quot;"/>
                        </list>
                      </operator>
                      <connect from_port="example set" to_op="Extract Macro (3)" to_port="example set"/>
                      <connect from_op="Extract Macro (3)" from_port="example set" to_op="Generate Macro (2)" to_port="through 1"/>
                      <connect from_op="Generate Macro (2)" from_port="through 1" to_port="example set"/>
                      <portSpacing port="source_example set" spacing="0"/>
                      <portSpacing port="sink_example set" spacing="0"/>
                      <portSpacing port="sink_output 1" spacing="0"/>
                    </process>
                  </operator>
                  <connect from_port="condition" to_op="Generate Macro" to_port="through 1"/>
                  <connect from_op="Generate Macro" from_port="through 1" to_op="Generate Macro (3)" to_port="through 1"/>
                  <connect from_op="Generate Macro (3)" from_port="through 1" to_op="Filter Examples" to_port="example set input"/>
                  <connect from_op="Filter Examples" from_port="example set output" to_op="Loop Examples (2)" to_port="example set"/>
                  <connect from_op="Filter Examples" from_port="original" to_op="Set Data" to_port="example set input"/>
                  <connect from_op="Set Data" from_port="example set output" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="180"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
              </operator>
              <connect from_port="example set" to_op="Extract Macro (2)" to_port="example set"/>
              <connect from_op="Extract Macro (2)" from_port="example set" to_op="Branch" to_port="condition"/>
              <connect from_op="Branch" from_port="input 1" to_port="example set"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Nominal Data" from_port="output" to_op="Generate Empty Attribute" to_port="example set input"/>
          <connect from_op="Generate Empty Attribute" from_port="example set output" to_op="Set Macro" to_port="through 1"/>
          <connect from_op="Set Macro" from_port="through 1" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Thanks for your help.
  • colo Member Posts: 236 Maven
    Hi Richy,

    To get your process running, you should replace the "Set Macro" operator used for the generation of macro "test" (right before the Sort operator) with a "Generate Macro" operator. Unfortunately, you have to be careful when using quotes to define strings. Basically, "Generate ..." operators use an expression parser and thus allow the usual declaration of strings with quotes. Other operators such as "Set Macro" don't allow this; there you have to enter the desired value directly. Setting the value to "" doesn't result in an empty string but generates a string value containing two quotes.
    Then you simply need to add quotes to the Branch operator's condition parameter to compare two string values correctly (an expression parser is used here again, as the condition type "expression" says): "%{test}" == "%{att1_value}". That's it, but it won't generate the results you want ;) You use att3_value in the Then part of the Branch operator, where it isn't defined yet.
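
    To see why the quotes matter, the following Python sketch uses a deliberately simplified model of macro expansion (plain text replacement of `%{name}` before the expression is parsed). It is not RapidMiner's actual implementation, and `expand_macros` is a hypothetical helper written for this illustration.

```python
import re


def expand_macros(text, macros):
    """Replace every %{name} in text with the macro's value, as plain text."""
    return re.sub(r"%\{(\w+)\}", lambda m: macros[m.group(1)], text)


# "Set Macro" with the value "" stores the two quote characters themselves.
broken = expand_macros(
    "%{test} == %{att1_value}",
    {"test": '""', "att1_value": "value0"},
)
print(broken)  # prints: "" == value0

# With an empty macro and quotes in the condition, both sides are strings.
fixed = expand_macros(
    '"%{test}" == "%{att1_value}"',
    {"test": "", "att1_value": "value0"},
)
print(fixed)  # prints: "" == "value0"
```

    In the first case the parser is handed the bare word value0, which matches the "Unrecognized symbol" message; in the second case it is handed two quoted strings.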

    In my last example I simply built a process for solving the task you described, but I didn't really notice that att3 only depends on the value of att1 (att2 is only needed for building the value). So you could build all valid values for att3 in a first step and then simply filter examples by att1 in a loop and set att3 for all examples. This shouldn't be too hard. Good luck ;)

    Regards,
    Matthias
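
The two-step idea suggested above (first build every att3 value, then assign it per att1) can be sketched in Python as follows. This is an illustration of the logic, not a RapidMiner process. The function name `add_group_concat` and the list-of-dicts layout are assumptions.

```python
def add_group_concat(rows):
    """Two passes: build one concatenated string per att1, then assign it."""
    concatenated = {}
    for row in rows:  # step 1: one att3 value per distinct att1
        key = row["att1"]
        concatenated[key] = concatenated.get(key, "") + row["att2"]
    # step 2: every row looks up the value that belongs to its att1
    return [{**row, "att3": concatenated[row["att1"]]} for row in rows]


# Richy's sample data from earlier in the thread
data = [
    {"att1": "val1", "att2": "c"}, {"att1": "val2", "att2": "a"},
    {"att1": "val3", "att2": "a"}, {"att1": "val2", "att2": "b"},
    {"att1": "val3", "att2": "c"}, {"att1": "val1", "att2": "b"},
    {"att1": "val3", "att2": "a"}, {"att1": "val2", "att2": "a"},
]
for row in add_group_concat(data):
    print(row["att1"], row["att2"], row["att3"])
```

Each row is visited twice in total, instead of once per other row as in the nested-loop version.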
  • Richy Member Posts: 20 Contributor II
    Hi Matthias,

    Thank you very much for your answer. It helped me a lot to understand how macros work.

    I found a way to optimize your first code a little. I'm now simply using the "attribute_value_filter" condition type in the Branch operator, with this condition: att1 = %{att1_value} [%{example}]
    So now, I only calculate the new value of att3 when I encounter a new value of att1. This saves real time on a big dataset (from 6 min down to 3 min).
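
    As a rough Python illustration of this optimization (not the RapidMiner process itself; `add_group_concat_sorted` and the list-of-dicts layout are assumptions made for the sketch): after sorting by att1, the concatenated value is rebuilt only when att1 differs from the previous row and is otherwise reused.

```python
def add_group_concat_sorted(rows):
    """Sort by att1, then recompute att3 only when att1 changes."""
    ordered = sorted(rows, key=lambda r: r["att1"])  # stable, like Sort
    result = []
    previous_key = object()  # sentinel that equals no real att1 value
    att3_value = ""
    for row in ordered:
        if row["att1"] != previous_key:      # new att1 value: rebuild once
            previous_key = row["att1"]
            att3_value = "".join(
                r["att2"] for r in ordered if r["att1"] == previous_key
            )
        result.append({**row, "att3": att3_value})  # otherwise reuse it
    return result


data = [
    {"att1": "val1", "att2": "c"}, {"att1": "val2", "att2": "a"},
    {"att1": "val3", "att2": "a"}, {"att1": "val2", "att2": "b"},
    {"att1": "val3", "att2": "c"}, {"att1": "val1", "att2": "b"},
    {"att1": "val3", "att2": "a"}, {"att1": "val2", "att2": "a"},
]
for row in add_group_concat_sorted(data):
    print(row["att1"], row["att2"], row["att3"])
```

    Note that the rows come back in sorted order, so the filter step runs once per distinct att1 value rather than once per example.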

    Here is my code:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.8" expanded="true" name="Process">
        <process expanded="true" height="280" width="681">
          <operator activated="true" class="generate_nominal_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="30">
            <parameter key="number_examples" value="10"/>
            <parameter key="number_of_attributes" value="2"/>
          </operator>
          <operator activated="true" class="generate_empty_attribute" compatibility="5.0.11" expanded="true" height="76" name="Generate Empty Attribute" width="90" x="179" y="30">
            <parameter key="name" value="att3"/>
            <parameter key="value_type" value="polynominal"/>
          </operator>
          <operator activated="true" class="sort" compatibility="5.0.11" expanded="true" height="76" name="Sort" width="90" x="313" y="30">
            <parameter key="attribute_name" value="att1"/>
          </operator>
          <operator activated="true" class="loop_examples" compatibility="5.0.8" expanded="true" height="76" name="Loop Examples" width="90" x="447" y="30">
            <process expanded="true" height="589" width="820">
              <operator activated="true" class="branch" compatibility="5.0.11" expanded="true" height="76" name="Branch" width="90" x="112" y="30">
                <parameter key="condition_type" value="attribute_value_filter"/>
                <parameter key="condition_value" value="att1 = %{att1_value} [%{example}]"/>
                <process expanded="true" height="505" width="346">
                  <operator activated="true" class="set_data" compatibility="5.0.11" expanded="true" height="76" name="Set Data (3)" width="90" x="112" y="30">
                    <parameter key="attribute_name" value="att3"/>
                    <parameter key="example_index" value="%{example}"/>
                    <parameter key="value" value="%{att3_value}"/>
                  </operator>
                  <connect from_port="condition" to_op="Set Data (3)" to_port="example set input"/>
                  <connect from_op="Set Data (3)" from_port="example set output" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
                <process expanded="true" height="505" width="415">
                  <operator activated="true" class="extract_macro" compatibility="5.0.11" expanded="true" height="60" name="Extract Macro (2)" width="90" x="45" y="30">
                    <parameter key="macro" value="att1_value"/>
                    <parameter key="macro_type" value="data_value"/>
                    <parameter key="attribute_name" value="att1"/>
                    <parameter key="example_index" value="%{example}"/>
                  </operator>
                  <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro" width="90" x="179" y="30">
                    <list key="function_descriptions">
                      <parameter key="att3_value" value="&quot;&quot;"/>
                    </list>
                  </operator>
                  <operator activated="true" class="filter_examples" compatibility="5.0.11" expanded="true" height="76" name="Filter Examples (2)" width="90" x="112" y="210">
                    <parameter key="condition_class" value="attribute_value_filter"/>
                    <parameter key="parameter_string" value="att1 = %{att1_value}"/>
                  </operator>
                  <operator activated="true" class="loop_examples" compatibility="5.0.11" expanded="true" height="76" name="Loop Examples (3)" width="90" x="246" y="210">
                    <parameter key="iteration_macro" value="example_inner"/>
                    <process expanded="true" height="505" width="839">
                      <operator activated="true" class="extract_macro" compatibility="5.0.11" expanded="true" height="60" name="Extract Macro (4)" width="90" x="45" y="30">
                        <parameter key="macro" value="att2_value"/>
                        <parameter key="macro_type" value="data_value"/>
                        <parameter key="attribute_name" value="att2"/>
                        <parameter key="example_index" value="%{example_inner}"/>
                      </operator>
                      <operator activated="true" class="generate_macro" compatibility="5.0.11" expanded="true" height="76" name="Generate Macro (3)" width="90" x="442" y="30">
                        <list key="function_descriptions">
                          <parameter key="att3_value" value="&quot;%{att3_value}%{att2_value}&quot;"/>
                        </list>
                      </operator>
                      <connect from_port="example set" to_op="Extract Macro (4)" to_port="example set"/>
                      <connect from_op="Extract Macro (4)" from_port="example set" to_op="Generate Macro (3)" to_port="through 1"/>
                      <connect from_op="Generate Macro (3)" from_port="through 1" to_port="example set"/>
                      <portSpacing port="source_example set" spacing="0"/>
                      <portSpacing port="sink_example set" spacing="0"/>
                      <portSpacing port="sink_output 1" spacing="0"/>
                    </process>
                  </operator>
                  <operator activated="true" class="set_data" compatibility="5.0.11" expanded="true" height="76" name="Set Data (2)" width="90" x="246" y="300">
                    <parameter key="attribute_name" value="att3"/>
                    <parameter key="example_index" value="%{example}"/>
                    <parameter key="value" value="%{att3_value}"/>
                  </operator>
                  <connect from_port="condition" to_op="Extract Macro (2)" to_port="example set"/>
                  <connect from_op="Extract Macro (2)" from_port="example set" to_op="Generate Macro" to_port="through 1"/>
                  <connect from_op="Generate Macro" from_port="through 1" to_op="Filter Examples (2)" to_port="example set input"/>
                  <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Loop Examples (3)" to_port="example set"/>
                  <connect from_op="Filter Examples (2)" from_port="original" to_op="Set Data (2)" to_port="example set input"/>
                  <connect from_op="Set Data (2)" from_port="example set output" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
              </operator>
              <connect from_port="example set" to_op="Branch" to_port="condition"/>
              <connect from_op="Branch" from_port="input 1" to_port="example set"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_example set" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Nominal Data" from_port="output" to_op="Generate Empty Attribute" to_port="example set input"/>
          <connect from_op="Generate Empty Attribute" from_port="example set output" to_op="Sort" to_port="example set input"/>
          <connect from_op="Sort" from_port="example set output" to_op="Loop Examples" to_port="example set"/>
          <connect from_op="Loop Examples" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    If anybody has a better idea, I'll take it.
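For readers who want to check the logic outside RapidMiner, the visible part of the process above (filter on att1, loop over the matching examples, append each att2 value to a macro, write the macro into att3) can be sketched in plain Python. This is only an illustration under assumptions: the attribute names att1, att2 and att3 come from the process, while the function name `concat_by_group`, the sample rows and the `sep` argument are made up here. The process itself appends the values with no separator, which corresponds to the default `sep=""`.

```python
def concat_by_group(rows, key="att1", source="att2", target="att3", sep=""):
    """Return copies of rows where `target` holds the joined `source`
    values of all rows that share the same `key` value."""
    result = []
    for row in rows:
        # Counterpart of Filter Examples: keep rows with the same key value.
        matching = [r for r in rows if r[key] == row[key]]
        # Counterpart of the inner loop: append each source value in turn.
        combined = sep.join(str(r[source]) for r in matching)
        # Counterpart of Set Data: store the string on the current row.
        result.append({**row, target: combined})
    return result


# Hypothetical sample data, not taken from the thread.
rows = [
    {"att1": "a", "att2": "x"},
    {"att1": "a", "att2": "y"},
    {"att1": "b", "att2": "z"},
]
grouped = concat_by_group(rows, sep=",")
```

With the sample rows, both "a" rows receive the same att3 string built from x and y, and the "b" row receives only z.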
  • Options
    land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi all,
    it would be cool if you could contribute your processes to myExperiment! That's an easy way for other users to reuse them.

    Greetings,
      Sebastian
  • Options
    Richy Member Posts: 20 Contributor II
    Hi Sebastian,

    This script is uploaded.
  • Options
    wessel Member Posts: 537 Maven
    You can also use the script operator.
    You have to make sure that att3 exists.
    For this you can use the Generate Empty Attribute operator.

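    // Fetch the ExampleSet delivered to the script's input port.
    ExampleSet es = operator.getInput(ExampleSet.class);

    // For every example (row), write att1 and att2, separated by a space, into att3.
    for (Example e : es) {
        e["att3"] = e["att1"] + " " + e["att2"];
    }

    // Pass the modified ExampleSet on to the output port.
    return es;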
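As written, the script joins att1 and att2 inside each row, whereas the original question asks for one value built from all rows. The difference can be illustrated in plain Python. This is a sketch, not RapidMiner code: the function names and the sample att2 values are hypothetical, and only the att1 values are taken from the question.

```python
def concat_within_rows(rows):
    # Same effect as the script: att3 = att1 + " " + att2, row by row.
    return [{**r, "att3": f'{r["att1"]} {r["att2"]}'} for r in rows]


def concat_across_rows(rows, attribute, sep=", "):
    # Joins one attribute over all rows into a single string,
    # which is the kind of value the question asks for.
    return sep.join(str(r[attribute]) for r in rows)


rows = [
    {"att1": "atts48", "att2": "a"},
    {"att1": "atts67", "att2": "b"},
    {"att1": "atts90", "att2": "c"},
]
per_row = concat_within_rows(rows)
single_value = concat_across_rows(rows, "att1")
```

The first function keeps one result per row; the second collapses the column into a single string that could then be stored as a macro or a data value.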
  • Options
    rachel_lomasky Member Posts: 52 Guru

    This is ugly (it would be nice if the Aggregate operator had the option to pick the concatenation character), but this takes a list like:

    my_string
    ________
    giants
    patriots
    lions
    eagles

    And returns: 'giants','patriots','lions','eagles'

    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="select_attributes" compatibility="7.4.000" expanded="true" height="82" name="Get just my_string" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="my_string"/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="generate_attributes" compatibility="7.4.000" expanded="true" height="82" name="Quote" width="90" x="447" y="34">
    <list key="function_descriptions">
    <parameter key="my_string" value="concat(&quot;'&quot;,my_string,&quot;'&quot;)"/>
    </list>
    <parameter key="keep_all" value="true"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="aggregate" compatibility="7.4.000" expanded="true" height="82" name="Aggregate" width="90" x="581" y="34">
    <parameter key="use_default_aggregation" value="false"/>
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="default_aggregation_function" value="average"/>
    <list key="aggregation_attributes">
    <parameter key="my_string" value="concatenation"/>
    </list>
    <parameter key="group_by_attributes" value=""/>
    <parameter key="count_all_combinations" value="false"/>
    <parameter key="only_distinct" value="false"/>
    <parameter key="ignore_missings" value="true"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="replace" compatibility="7.4.000" expanded="true" height="82" name="Replace" width="90" x="715" y="34">
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="file_path"/>
    <parameter key="block_type" value="single_value"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="single_value"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="replace_what" value="\|"/>
    <parameter key="replace_by" value=","/>
    </operator>
    </process>
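The three working steps above (Quote, Aggregate with concatenation, Replace) can be mirrored in a few lines of Python to see what each one contributes. This is a minimal sketch: it assumes, as the `\|` pattern in the Replace operator suggests, that the concatenation aggregation joins values with "|". The function name `quoted_list` is hypothetical.

```python
def quoted_list(values):
    # Quote step: wrap every value in single quotes.
    quoted = ["'" + v + "'" for v in values]
    # Aggregate step: join all rows into one string, assuming "|" as separator.
    aggregated = "|".join(quoted)
    # Replace step: swap the "|" separator for a comma.
    return aggregated.replace("|", ",")


teams = ["giants", "patriots", "lions", "eagles"]
result = quoted_list(teams)
```

One consequence of doing the replacement last, in the sketch as in the process, is that a "|" occurring inside a value would also be turned into a comma.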
  • Options
    rachel_lomasky Member Posts: 52 Guru

    Or, as a process that takes a macro called "attribute" naming the feature and returns an attribute called "list":

    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="generate_attributes" compatibility="7.4.000" expanded="true" height="82" name="Quote" width="90" x="179" y="34">
    <list key="function_descriptions">
    <parameter key="%{attribute}" value="concat(&quot;'&quot;,eval(%{attribute}),&quot;'&quot;)"/>
    </list>
    <parameter key="keep_all" value="true"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="aggregate" compatibility="7.4.000" expanded="true" height="82" name="Aggregate" width="90" x="313" y="34">
    <parameter key="use_default_aggregation" value="false"/>
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="attribute_value"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="time"/>
    <parameter key="block_type" value="attribute_block"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_matrix_row_start"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="default_aggregation_function" value="average"/>
    <list key="aggregation_attributes">
    <parameter key="%{attribute}" value="concatenation"/>
    </list>
    <parameter key="group_by_attributes" value=""/>
    <parameter key="count_all_combinations" value="false"/>
    <parameter key="only_distinct" value="false"/>
    <parameter key="ignore_missings" value="true"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="replace" compatibility="7.4.000" expanded="true" height="82" name="Replace" width="90" x="447" y="34">
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="nominal"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="file_path"/>
    <parameter key="block_type" value="single_value"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="single_value"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="replace_what" value="\|"/>
    <parameter key="replace_by" value=","/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <operator activated="true" class="rename" compatibility="7.4.000" expanded="true" height="82" name="Rename" width="90" x="581" y="34">
    <parameter key="old_name" value="concat(%{attribute})"/>
    <parameter key="new_name" value="list"/>
    <list key="rename_additional_attributes"/>
    </operator>
    </process>
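The macro-driven variant adds one thing to the earlier chain: the aggregated column first carries a generated name, which the Rename operator then changes to "list". A rough Python equivalent, with the attribute name passed as an argument in place of the %{attribute} macro, might look as follows. The function name and sample rows are hypothetical, and the intermediate name `concat(<attribute>)` is taken from the `old_name` parameter above.

```python
def build_list_attribute(rows, attribute):
    # Quote: overwrite the chosen attribute with its quoted value.
    quoted = [{**r, attribute: "'" + str(r[attribute]) + "'"} for r in rows]
    # Aggregate: a single row whose column is named concat(<attribute>).
    generated_name = f"concat({attribute})"
    aggregated = {generated_name: "|".join(r[attribute] for r in quoted)}
    # Replace: turn the "|" separator into a comma.
    replaced = {k: v.replace("|", ",") for k, v in aggregated.items()}
    # Rename: expose the result under the fixed name "list".
    return {"list": replaced[generated_name]}


rows = [{"my_string": "giants"}, {"my_string": "patriots"}]
row = build_list_attribute(rows, "my_string")
```

Keeping the output name fixed means downstream steps can read "list" without knowing which attribute was aggregated.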