[solved] Fill missing value

greg Member Posts: 23 Contributor II
edited November 2018 in Help
My problem is this: on some rows, the value of attr1 is missing. On these rows only, I'd like to fill in the value with a formula such as "if (attr2 > 10) then (attr2 - 10) else (attr3 / 2)".

Can I do that without adding a new attribute to all rows and using the "multiple" operator?




  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi, that's not possible. You can use something like the process below. Note that the Rename by Replacing operator at the beginning replaces every minus sign "-" in the attribute names with an underscore "_" so that the names do not confuse the expression parser in Generate Attributes.

    Best, Marius
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.015">
      <operator activated="true" class="process" compatibility="5.1.015" expanded="true" name="Process">
        <process expanded="true" height="280" width="681">
          <!-- Load the Labor-Negotiations sample data -->
          <operator activated="true" class="retrieve" compatibility="5.1.015" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
            <parameter key="repository_entry" value="//Samples/data/Labor-Negotiations"/>
          </operator>
          <!-- Replace "-" with "_" in attribute names so they do not confuse the expression parser -->
          <operator activated="true" class="rename_by_replacing" compatibility="5.1.015" expanded="true" height="76" name="Rename by Replacing" width="90" x="179" y="30">
            <parameter key="include_special_attributes" value="true"/>
            <parameter key="replace_what" value="-"/>
            <parameter key="replace_by" value="_"/>
          </operator>
          <!-- Build the replacement: keep wage_inc_3rd where present, otherwise compute it from duration -->
          <operator activated="true" class="generate_attributes" compatibility="5.1.015" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="30">
            <list key="function_descriptions">
              <parameter key="wage_inc_3rd_replacement" value="if(missing(wage_inc_3rd),if(duration&gt;2,duration-10,duration/3),wage_inc_3rd)"/>
            </list>
          </operator>
          <!-- Drop the original wage_inc_3rd (invert_selection keeps every other attribute) -->
          <operator activated="true" class="select_attributes" compatibility="5.1.015" expanded="true" height="76" name="Select Attributes" width="90" x="447" y="30">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="wage_inc_3rd"/>
            <parameter key="invert_selection" value="true"/>
          </operator>
          <!-- Give the replacement attribute the original name -->
          <operator activated="true" class="rename" compatibility="5.1.015" expanded="true" height="76" name="Rename" width="90" x="581" y="30">
            <parameter key="old_name" value="wage_inc_3rd_replacement"/>
            <parameter key="new_name" value="wage_inc_3rd"/>
            <list key="rename_additional_attributes"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Rename by Replacing" to_port="example set input"/>
          <connect from_op="Rename by Replacing" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • greg Member Posts: 23 Contributor II
    Thanks for your answer. I'm beginning to understand how to work with RapidMiner.

    About attribute names, I have another question, but I'll start another topic.
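
For readers who want to check the row-wise logic outside RapidMiner, the conditional fill from the original question can be sketched in plain Python. This is only an illustration, not part of the RapidMiner process: the function name `fill_attr1` and the sample rows are hypothetical, missing values are represented as `None`, and the sketch assumes attr2 and attr3 are never missing themselves.

```python
from typing import Optional

# A row maps attribute names to numeric values; None stands for "missing".
Row = dict[str, Optional[float]]


def fill_attr1(row: Row) -> Row:
    """Return a copy of the row with attr1 filled in only if it is missing.

    Uses the formula from the question:
    if (attr2 > 10) then (attr2 - 10) else (attr3 / 2).
    Assumes attr2 and attr3 are present (hypothetical simplification).
    """
    filled = dict(row)
    if filled["attr1"] is None:
        attr2, attr3 = filled["attr2"], filled["attr3"]
        filled["attr1"] = attr2 - 10 if attr2 > 10 else attr3 / 2
    return filled


# Hypothetical sample data, not taken from the thread.
rows = [
    {"attr1": 5.0, "attr2": 20.0, "attr3": 8.0},   # attr1 present: left unchanged
    {"attr1": None, "attr2": 25.0, "attr3": 8.0},  # missing, attr2 > 10: attr2 - 10
    {"attr1": None, "attr2": 4.0, "attr3": 8.0},   # missing, attr2 <= 10: attr3 / 2
]
result = [fill_attr1(r) for r in rows]
print([r["attr1"] for r in result])  # [5.0, 15.0, 4.0]
```

Like the nested `if(missing(...), ..., ...)` expression in the process above, the function touches only the rows where the value is missing and passes the others through unchanged.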