"[Solved] Aggregation and generate Attribute"

fras Member Posts: 93 Contributor II
edited June 2019 in Help
Hi!
Maybe I missed something obvious:
it seems that it is not possible to post-process an
attribute produced by an Aggregate operator.
In the attached process I would like to do some calculations
on "sum(att1)", but it fails. Do I need macros?
Thanks in advance for any hints!

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
   <process expanded="true" height="431" width="882">
     <operator activated="true" class="generate_data" compatibility="5.2.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="210">
       <parameter key="target_function" value="random classification"/>
     </operator>
     <operator activated="true" class="aggregate" compatibility="5.2.008" expanded="true" height="76" name="Aggregate" width="90" x="179" y="75">
       <list key="aggregation_attributes">
         <parameter key="att1" value="sum"/>
       </list>
       <parameter key="group_by_attributes" value="label|"/>
     </operator>
     <operator activated="true" class="generate_attributes" compatibility="5.2.008" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="75">
       <list key="function_descriptions">
         <parameter key="new_attrib" value="sum(att1) * 100"/>
       </list>
     </operator>
     <connect from_op="Generate Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
     <connect from_op="Aggregate" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
     <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
     <connect from_op="Generate Attributes" from_port="original" to_port="result 2"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
     <portSpacing port="sink_result 3" spacing="0"/>
   </process>
 </operator>
</process>


Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Generate Attributes and other operators that use the formula editor can't handle attribute names containing spaces, braces, parentheses, etc. You have to rename the attributes first. You can do that manually with Rename, or rename several attributes at once with Rename by Replacing. Please see the attached process for an example.

    Happy Mining!
    ~Marius
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
        <process expanded="true" height="527" width="658">
          <operator activated="true" breakpoints="after" class="generate_data_user_specification" compatibility="5.2.008" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="112" y="30">
            <list key="attribute_values">
              <parameter key="a(1)" value="&quot;zipp&quot;"/>
              <parameter key="an attribute with a space and (parentheses)" value="&quot;zapp&quot;"/>
            </list>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="rename_by_replacing" compatibility="5.2.008" expanded="true" height="76" name="Rename by Replacing" width="90" x="246" y="30">
            <parameter key="include_special_attributes" value="true"/>
            <parameter key="replace_what" value="\(|\)| "/>
            <parameter key="replace_by" value="_"/>
          </operator>
          <connect from_op="Generate Data by User Specification" from_port="output" to_op="Rename by Replacing" to_port="example set input"/>
          <connect from_op="Rename by Replacing" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
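
Applied to the process from the question, the same fix could look roughly like the sketch below. It places Rename by Replacing between Aggregate and Generate Attributes and reuses the regular expression from the answer, which should turn "sum(att1)" into "sum_att1_" (the trailing underscore comes from the closing parenthesis). The operator position (x/y) and the renamed attribute used in the formula are assumptions made for illustration and do not appear in the original posts.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="431" width="882">
      <operator activated="true" class="generate_data" compatibility="5.2.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="210">
        <parameter key="target_function" value="random classification"/>
      </operator>
      <operator activated="true" class="aggregate" compatibility="5.2.008" expanded="true" height="76" name="Aggregate" width="90" x="179" y="75">
        <list key="aggregation_attributes">
          <parameter key="att1" value="sum"/>
        </list>
        <parameter key="group_by_attributes" value="label|"/>
      </operator>
      <!-- Assumption: same regular expression as in the answer; it replaces "(", ")" and spaces with "_" -->
      <operator activated="true" class="rename_by_replacing" compatibility="5.2.008" expanded="true" height="76" name="Rename by Replacing" width="90" x="313" y="75">
        <parameter key="replace_what" value="\(|\)| "/>
        <parameter key="replace_by" value="_"/>
      </operator>
      <!-- Assumption: after renaming, "sum(att1)" is available as "sum_att1_" -->
      <operator activated="true" class="generate_attributes" compatibility="5.2.008" expanded="true" height="76" name="Generate Attributes" width="90" x="447" y="75">
        <list key="function_descriptions">
          <parameter key="new_attrib" value="sum_att1_ * 100"/>
        </list>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Aggregate" to_port="example set input"/>
      <connect from_op="Aggregate" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
      <connect from_op="Rename by Replacing" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
```

If the trailing underscore is unwanted, a single Rename operator that maps "sum(att1)" to a name of your choice would be the manual alternative mentioned in the answer.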