[SOLVED] outer Join

tennenrishin Member Posts: 177 Contributor II
edited November 2019 in Help
In the output of this example, b is missing in row 2. Is that the correct behavior for an outer join?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
 <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
   <process expanded="true" height="659" width="1043">
     <operator activated="true" class="generate_data_user_specification" compatibility="5.2.008" expanded="true" height="60" name="a=1 b=1" width="90" x="45" y="120">
       <list key="attribute_values">
         <parameter key="a" value="1"/>
         <parameter key="b" value="1"/>
       </list>
       <list key="set_additional_roles"/>
     </operator>
     <operator activated="true" class="generate_data_user_specification" compatibility="5.2.008" expanded="true" height="60" name="a=2 b=2" width="90" x="45" y="255">
       <list key="attribute_values">
         <parameter key="a" value="2"/>
         <parameter key="b" value="2"/>
       </list>
       <list key="set_additional_roles"/>
     </operator>
     <operator activated="true" class="join" compatibility="5.2.008" expanded="true" height="76" name="Join" width="90" x="447" y="165">
       <parameter key="join_type" value="outer"/>
       <parameter key="use_id_attribute_as_key" value="false"/>
       <list key="key_attributes">
         <parameter key="a" value="a"/>
       </list>
     </operator>
     <connect from_op="a=1 b=1" from_port="output" to_op="Join" to_port="left"/>
     <connect from_op="a=2 b=2" from_port="output" to_op="Join" to_port="right"/>
     <connect from_op="Join" from_port="join" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="126"/>
     <portSpacing port="sink_result 2" spacing="72"/>
   </process>
 </operator>
</process>


    MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn

    This is a bit confusing because the second attribute ("b") has the same name in both example sets.
    What happens is the following: the parameter "remove_double_attributes" is enabled, which means that when an attribute with the same name exists on both the left and the right side, RapidMiner always uses the one from the left side. The row with a=2 exists only on the right side, so there is no value for b on the left, and b is missing in the output.
    If you disable that parameter, everything will be as expected.
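
    For reference, here is a rough sketch of how the Join operator from the question might look with that parameter switched off. The parameter key is the one named above; writing its value as "false" is an assumption, modelled on how "use_id_attribute_as_key" is written in the same process, so check it against what your RapidMiner version exports:

    ```xml
    <operator activated="true" class="join" compatibility="5.2.008" expanded="true" height="76" name="Join" width="90" x="447" y="165">
      <!-- assumption: disables dropping of equally named attributes from the right side -->
      <parameter key="remove_double_attributes" value="false"/>
      <parameter key="join_type" value="outer"/>
      <parameter key="use_id_attribute_as_key" value="false"/>
      <list key="key_attributes">
        <parameter key="a" value="a"/>
      </list>
    </operator>
    ```

    Alternatively, renaming "b" in one of the two example sets before the join should avoid the name clash altogether.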

    Happy Mining!