How can I generate a new attribute based on a condition?

pallavpallav Member Posts: 21 Contributor II
edited February 25 in Help
Suppose we have a dataset with the attributes processing_month and month1. I want to generate new attributes depending on the value of processing_month: if the value is "1", do not create any attribute; if it is "2", create a new attribute "month2"; if it is "3", create two attributes, "month2" and "month3"; if it is "4", create three attributes, "month2", "month3" and "month4".
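The rule described above can be written down compactly. The following is only a minimal sketch in plain Python, not a RapidMiner process; the function name `month_attributes` is hypothetical, and it assumes processing_month is a whole number and that month1 already exists in the data.

```python
def month_attributes(processing_month):
    """Return the names of the attributes to create for one processing_month value."""
    # month1 is assumed to exist already, so new attributes start at month2.
    return [f"month{m}" for m in range(2, int(processing_month) + 1)]


if __name__ == "__main__":
    for value in (1, 2, 3, 4):
        print(value, month_attributes(value))
```

For a value of 1 the list is empty, so nothing is created; for 4 it contains month2, month3 and month4.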






Best Answers

  • pallavpallav Posts: 21 Contributor II
    Solution Accepted
    @sgenzer - Thanks, that works well for me.

Answers

  • varunm1varunm1 Moderator, Member Posts: 1,115   Unicorn
    Hello @pallav

    You can use the Generate Attributes operator and give it a nested if expression (here, any other value falls through to a missing value):

    if(processing_month == 1, 1, if(processing_month == 2, "month 2", if(processing_month == 3, "month 2 and month 3", if(processing_month == 4, "month 2, month 3 and month 4", MISSING_NOMINAL))))

    Is this what you are looking for?

    Please let us know if you need something different.
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • pallavpallav Member Posts: 21 Contributor II
    edited February 25
    But it seems we cannot leave the attribute name blank; we have to put some value there. In this case, what value can we give? Whether an attribute is generated at all depends on the processing month.
  • sgenzersgenzer 12Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,860  Community Manager
    edited February 25
    @pallav how does this work for you?

    <?xml version="1.0" encoding="UTF-8"?><process version="9.6.000-RC">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.6.000-RC" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="-1"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" breakpoints="after" class="subprocess" compatibility="9.6.000-RC" expanded="true" height="82" name="Subprocess" width="90" x="45" y="34">
            <process expanded="true">
              <operator activated="true" class="generate_data" compatibility="9.6.000-RC" expanded="true" height="68" name="Generate Data" width="90" x="45" y="34">
                <parameter key="target_function" value="random"/>
                <parameter key="number_examples" value="100"/>
                <parameter key="number_of_attributes" value="1"/>
                <parameter key="attributes_lower_bound" value="-10.0"/>
                <parameter key="attributes_upper_bound" value="10.0"/>
                <parameter key="gaussian_standard_deviation" value="10.0"/>
                <parameter key="largest_radius" value="10.0"/>
                <parameter key="use_local_random_seed" value="false"/>
                <parameter key="local_random_seed" value="1992"/>
                <parameter key="datamanagement" value="double_array"/>
                <parameter key="data_management" value="auto"/>
              </operator>
              <operator activated="true" class="generate_attributes" compatibility="9.6.000-RC" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="34">
                <list key="function_descriptions">
                  <parameter key="processing_month" value="rint(abs(10*label))"/>
                  <parameter key="Month_1" value="rint(abs(1000*att1))"/>
                </list>
                <parameter key="keep_all" value="true"/>
              </operator>
              <operator activated="true" class="select_attributes" compatibility="9.6.000-RC" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="34">
                <parameter key="attribute_filter_type" value="subset"/>
                <parameter key="attribute" value=""/>
                <parameter key="attributes" value="Month_1|processing_month"/>
                <parameter key="use_except_expression" value="false"/>
                <parameter key="value_type" value="attribute_value"/>
                <parameter key="use_value_type_exception" value="false"/>
                <parameter key="except_value_type" value="time"/>
                <parameter key="block_type" value="attribute_block"/>
                <parameter key="use_block_type_exception" value="false"/>
                <parameter key="except_block_type" value="value_matrix_row_start"/>
                <parameter key="invert_selection" value="false"/>
                <parameter key="include_special_attributes" value="true"/>
              </operator>
              <operator activated="true" class="numerical_to_polynominal" compatibility="9.6.000-RC" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="447" y="34">
                <parameter key="attribute_filter_type" value="single"/>
                <parameter key="attribute" value="processing_month"/>
                <parameter key="attributes" value=""/>
                <parameter key="use_except_expression" value="false"/>
                <parameter key="value_type" value="numeric"/>
                <parameter key="use_value_type_exception" value="false"/>
                <parameter key="except_value_type" value="real"/>
                <parameter key="block_type" value="value_series"/>
                <parameter key="use_block_type_exception" value="false"/>
                <parameter key="except_block_type" value="value_series_end"/>
                <parameter key="invert_selection" value="false"/>
                <parameter key="include_special_attributes" value="false"/>
              </operator>
              <operator activated="true" class="order_attributes" compatibility="9.6.000-RC" expanded="true" height="82" name="Reorder Attributes" width="90" x="581" y="34">
                <parameter key="sort_mode" value="user specified"/>
                <parameter key="attribute_ordering" value="processing_month|Month_1"/>
                <parameter key="use_regular_expressions" value="false"/>
                <parameter key="handle_unmatched" value="append"/>
                <parameter key="sort_direction" value="ascending"/>
              </operator>
              <connect from_op="Generate Data" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
              <connect from_op="Generate Attributes" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
              <connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
              <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Reorder Attributes" to_port="example set input"/>
              <connect from_op="Reorder Attributes" from_port="example set output" to_port="out 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">sample data</description>
          </operator>
          <operator activated="true" class="concurrency:loop_values" compatibility="9.6.000-RC" expanded="true" height="82" name="Loop Values" width="90" x="179" y="34">
            <parameter key="attribute" value="processing_month"/>
            <parameter key="iteration_macro" value="processing_month"/>
            <parameter key="reuse_results" value="false"/>
            <parameter key="enable_parallel_execution" value="false"/>
            <process expanded="true">
              <operator activated="true" class="filter_examples" compatibility="9.6.000-RC" expanded="true" height="103" name="Filter Examples" width="90" x="45" y="34">
                <parameter key="parameter_expression" value=""/>
                <parameter key="condition_class" value="custom_filters"/>
                <parameter key="invert_filter" value="false"/>
                <list key="filters_list">
                  <parameter key="filters_entry_key" value="processing_month.equals.%{processing_month}"/>
                </list>
                <parameter key="filters_logic_and" value="true"/>
                <parameter key="filters_check_metadata" value="true"/>
                <description align="center" color="transparent" colored="false" width="126">filter for %{processing_month}</description>
              </operator>
              <operator activated="true" class="branch" compatibility="9.6.000-RC" expanded="true" height="82" name="Branch" width="90" x="179" y="34">
                <parameter key="condition_type" value="expression"/>
                <parameter key="expression" value="if(%{processing_month}==1,TRUE,FALSE)"/>
                <parameter key="io_object" value="ANOVAMatrix"/>
                <parameter key="return_inner_output" value="true"/>
                <process expanded="true">
                  <connect from_port="condition" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                  <description align="center" color="yellow" colored="false" height="105" resized="false" width="180" x="129" y="20">%{processing_month} = 1</description>
                </process>
                <process expanded="true">
                  <operator activated="true" class="generate_macro" compatibility="9.6.000-RC" expanded="true" height="82" name="Generate Macro" width="90" x="112" y="85">
                    <list key="function_descriptions">
                      <parameter key="newName" value="concat(&quot;Month_&quot;,%{processing_month})"/>
                    </list>
                    <description align="center" color="transparent" colored="false" width="126">%{newName}</description>
                  </operator>
                  <operator activated="true" class="rename" compatibility="9.6.000-RC" expanded="true" height="82" name="Rename" width="90" x="246" y="85">
                    <parameter key="old_name" value="Month_1"/>
                    <parameter key="new_name" value="%{newName}"/>
                    <list key="rename_additional_attributes"/>
                  </operator>
                  <connect from_port="condition" to_op="Generate Macro" to_port="through 1"/>
                  <connect from_op="Generate Macro" from_port="through 1" to_op="Rename" to_port="example set input"/>
                  <connect from_op="Rename" from_port="example set output" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                  <description align="center" color="yellow" colored="false" height="105" resized="false" width="180" x="132" y="20">all other cases</description>
                </process>
              </operator>
              <connect from_port="input 1" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Branch" to_port="condition"/>
              <connect from_op="Branch" from_port="input 1" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">loop by processing_month</description>
          </operator>
          <operator activated="true" class="operator_toolbox:advanced_append" compatibility="2.3.000" expanded="true" height="82" name="Append (Superset)" width="90" x="313" y="34"/>
          <connect from_op="Subprocess" from_port="out 1" to_op="Loop Values" to_port="input 1"/>
          <connect from_op="Loop Values" from_port="output 1" to_op="Append (Superset)" to_port="example set 1"/>
          <connect from_op="Append (Superset)" from_port="merged set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
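For readers who cannot import the process, the flow it sets up can be sketched in plain Python. This is only an illustration of the logic, not RapidMiner code: the function name `rename_by_processing_month` is hypothetical, rows are modelled as dictionaries, and processing_month is assumed to be nominal (a string), as it is after Numerical to Polynominal.

```python
from collections import defaultdict


def rename_by_processing_month(rows):
    """Mimic Loop Values -> Filter Examples -> Branch -> Rename -> Append (Superset)."""
    # Loop Values + Filter Examples: one group of rows per distinct processing_month.
    groups = defaultdict(list)
    for row in rows:
        groups[row["processing_month"]].append(row)

    renamed = []
    for month, group in groups.items():
        for row in group:
            new_row = dict(row)
            # Branch: rows with processing_month 1 pass through unchanged;
            # all other cases rename Month_1 to Month_<processing_month>.
            if str(month) != "1":
                new_row[f"Month_{month}"] = new_row.pop("Month_1")
            renamed.append(new_row)

    # Append (Superset): take the union of all columns and fill gaps with None.
    columns = []
    for row in renamed:
        for name in row:
            if name not in columns:
                columns.append(name)
    return [{name: row.get(name) for name in columns} for row in renamed]


if __name__ == "__main__":
    sample = [
        {"processing_month": "1", "Month_1": 10},
        {"processing_month": "3", "Month_1": 30},
    ]
    for row in rename_by_processing_month(sample):
        print(row)
```

Note that each group ends up with a single month column: the process renames Month_1 instead of adding one column per month up to processing_month.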
    



  • pallavpallav Member Posts: 21 Contributor II
    edited February 26
    @sgenzer - How will it work when I have both month_1 and month_2 in my dataset and want only month_3 to be created? Basically, it should create the month columns up to the corresponding processing_month value: if a column is already present, do not create anything; if it is not present, create a column for it. In this process it gives the following error:
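One possible reading of this requirement, sketched in plain Python: given the attribute names already in the dataset and a processing_month value, list only the month columns that are still missing. The function name `missing_month_columns` and the `month_<n>` naming are assumptions made for illustration; this is not the process that was later accepted as the solution.

```python
def missing_month_columns(existing, processing_month):
    """Return the month_<n> columns up to processing_month that do not exist yet."""
    wanted = [f"month_{m}" for m in range(1, int(processing_month) + 1)]
    # Compare case-insensitively, since the thread mixes "Month_1" and "month_1".
    present = {name.lower() for name in existing}
    return [name for name in wanted if name not in present]


if __name__ == "__main__":
    columns = ["processing_month", "month_1", "month_2"]
    print(missing_month_columns(columns, 3))
```

With month_1 and month_2 already present and a processing_month of 3, only month_3 is reported as missing.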






  • sgenzersgenzer 12Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,860  Community Manager
    edited February 26
    Ah, well, that is a different question! Let me get back to you with a new process.
  • pallavpallav Member Posts: 21 Contributor II
    @sgenzer - Thanks, that works great for me.