Possible bug with Aggregate operator

lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 748   Unicorn
edited June 14 in Help

Dear all,

 

I think I found a "fun" bug in the Aggregate operator while playing with the now-famous Titanic dataset.

I wanted to count the examples for each value of the label "Survived" (Yes / No), so I set the following parameters:

 - Aggregation attribute: Survived

 - Aggregation function: count

 - Group by attributes: Survived

 

When executed, the process raises the following error:

Titanic_aggregate.png

 

When the role of "Survived" is changed from label to regular (using the Set Role operator), the process works fine:

Titanic_aggregate_2.png
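
For reference, the workaround could look roughly like this in the process XML. This is a minimal sketch, not the exact process behind the screenshot: the `target_role` parameter and the Set Role port names (`example set input`, `example set output`) are assumptions, since the process shared in this thread has Set Role deactivated and unconnected.

```xml
<!-- Sketch only: target_role and the Set Role port names are assumptions -->
<operator activated="true" class="set_role" compatibility="8.2.001" expanded="true" height="82" name="Set Role" width="90" x="246" y="136">
  <parameter key="attribute_name" value="Survived"/>
  <!-- Turn the label into a regular attribute before aggregating -->
  <parameter key="target_role" value="regular"/>
  <list key="set_additional_roles"/>
</operator>
<!-- Route the data through Set Role instead of connecting Retrieve directly to Aggregate -->
<connect from_op="Retrieve Titanic Training" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Aggregate" to_port="example set input"/>
```

These two connections would replace the direct Retrieve-to-Aggregate connection.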

 

This may be a regression, because the same process was apparently used without error in the documentation about "Create Web Apps":

Titanic_aggregate_3.png

 

Regards,

 

 

Lionel

 

NB: here is the process:

<?xml version="1.0" encoding="UTF-8"?>
<process version="8.2.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="8.2.001" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="8.2.001" expanded="true" height="68" name="Retrieve Titanic Training" width="90" x="112" y="85">
        <parameter key="repository_entry" value="//Samples/data/Titanic Training"/>
      </operator>
      <operator activated="false" class="set_role" compatibility="8.2.001" expanded="true" height="82" name="Set Role" width="90" x="246" y="136">
        <parameter key="attribute_name" value="Survived"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="aggregate" compatibility="8.2.001" expanded="true" height="82" name="Aggregate" width="90" x="447" y="85">
        <list key="aggregation_attributes">
          <parameter key="Survived" value="count"/>
        </list>
        <parameter key="group_by_attributes" value="Survived"/>
      </operator>
      <connect from_op="Retrieve Titanic Training" from_port="output" to_op="Aggregate" to_port="example set input"/>
      <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

Best Answer

  • gmeier Posts: 12   RM Engineering
    Solution Accepted

    Hi @lionelderkrikor,

     

    This was fixed only recently and is only part of the newest 9.0.0 beta. It works with the version I downloaded from the beta page yesterday.

    If you cannot use the new beta yet, you can change the compatibility level of Aggregate back to 8.2.0.
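
    As a rough sketch, lowering the compatibility level would change only the `compatibility` attribute of the Aggregate operator in the process XML. The exact version string is an assumption here: "8.2.0" is written as `8.2.000`, following the three-digit patch format used in the processes posted in this thread.

    ```xml
    <!-- Sketch only: "8.2.000" is an assumed spelling of compatibility level 8.2.0 -->
    <operator activated="true" class="aggregate" compatibility="8.2.000" expanded="true" height="82" name="Aggregate" width="90" x="447" y="85">
      <list key="aggregation_attributes">
        <parameter key="Survived" value="count"/>
      </list>
      <parameter key="group_by_attributes" value="Survived"/>
    </operator>
    ```

    The other operators can keep their current compatibility level.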

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,122  RM Data Scientist

    Hi @lionelderkrikor,

    This bug occurs when you aggregate on a special attribute (an attribute with a non-regular role, such as the label). I think I've already reported it. Can you test whether it's present in 9.0 BETA?

     

    BR,

    Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 748   Unicorn

    Hi @mschmitz,

     

    Indeed, I shared the process using RM 8.2, but I confirm that the bug is still present in RM 9.0 BETA.

     

    The process:

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="9.0.000-BETA">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.0.000-BETA" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.0.000-BETA" expanded="true" height="68" name="Retrieve Titanic Training" width="90" x="112" y="85">
            <parameter key="repository_entry" value="//Samples/data/Titanic Training"/>
          </operator>
          <operator activated="false" class="set_role" compatibility="9.0.000-BETA" expanded="true" height="82" name="Set Role" width="90" x="246" y="136">
            <parameter key="attribute_name" value="Survived"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.0.000-BETA" expanded="true" height="82" name="Aggregate" width="90" x="447" y="85">
            <list key="aggregation_attributes">
              <parameter key="Survived" value="count"/>
            </list>
            <parameter key="group_by_attributes" value="Survived"/>
          </operator>
          <connect from_op="Retrieve Titanic Training" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

    Regards,

     

    Lionel

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 748   Unicorn

    Hi @gmeier,

     

    Indeed, I have been working with RM Studio 9.0 Beta since 18 July. Today I updated RM Studio to the newest version, 9.0 Beta4, and I confirm that this bug is fixed.

     

    Thank you for your reply,

     

    Regards,

     

     

    Lionel
