RapidMiner 9.7 is Now Available

Lots of amazing new improvements including true version control! Learn more about what's new here.

CLICK HERE TO DOWNLOAD

How to filter example values based on the attribute names ?

lionelderkrikorlionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,068   Unicorn
edited March 12 in Help
Dear all,

I propose an easy challenge in appearance (in appearance only)...  After many tests, I must admit that 
I'm totally stuck.

My process produces an example set with some attributes (the number of generated attributes is variable according to the set-up of the user) : 


For a given attribute, I want to filter the example values based on the attribute name : 
 - if the example value contains the attribute name, then the example is conserved.
 -  if the example value does not contain the attribute name, then the example is removed.
To perform this task, I try to used a Filter Examples operator inside a Loop Attributes subprocess with the following set-up : 
 - condition class = expression and parameter expression
  = > this solution produces an empty example set !

 - condition class = custom_filters and   filters = 


 = > this solution raises an error ("Instantiation error")


So I'm searching a 100 % native RapidMiner process (no Python / R scripts) to filter example values based on the 
attribute name. Is it possible ?

My data (.pdf file) is in attached file.
My process : 
<?xml version="1.0" encoding="UTF-8"?><process version="9.2.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="text:read_document" compatibility="8.1.000" expanded="true" height="68" name="Read Document" width="90" x="45" y="85">
        <parameter key="file" value="C:\Users\Lionel\Downloads\FeatureSelection_MRMR_rapidminer.pdf"/>
        <parameter key="extract_text_only" value="true"/>
        <parameter key="use_file_extension_as_type" value="true"/>
        <parameter key="content_type" value="txt"/>
        <parameter key="encoding" value="SYSTEM"/>
      </operator>
      <operator activated="true" class="text:process_documents" compatibility="8.1.000" expanded="true" height="103" name="Process Documents" width="90" x="179" y="34">
        <parameter key="create_word_vector" value="true"/>
        <parameter key="vector_creation" value="TF-IDF"/>
        <parameter key="add_meta_information" value="true"/>
        <parameter key="keep_text" value="false"/>
        <parameter key="prune_method" value="none"/>
        <parameter key="prune_below_percent" value="3.0"/>
        <parameter key="prune_above_percent" value="30.0"/>
        <parameter key="prune_below_rank" value="0.05"/>
        <parameter key="prune_above_rank" value="0.95"/>
        <parameter key="datamanagement" value="double_sparse_array"/>
        <parameter key="data_management" value="auto"/>
        <process expanded="true">
          <operator activated="true" class="text:tokenize" compatibility="8.1.000" expanded="true" height="68" name="Tokenize" width="90" x="313" y="85">
            <parameter key="mode" value="linguistic sentences"/>
            <parameter key="characters" value=".:"/>
            <parameter key="language" value="English"/>
            <parameter key="max_token_length" value="3"/>
          </operator>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="multiply" compatibility="9.2.001" expanded="true" height="103" name="Multiply" width="90" x="413" y="34"/>
      <operator activated="true" class="select_attributes" compatibility="9.2.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="1988" y="238">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="metadata_file"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="9.2.001" expanded="true" height="82" name="Generate ID (3)" width="90" x="2122" y="238">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="transpose" compatibility="9.2.001" expanded="true" height="82" name="Transpose" width="90" x="648" y="34"/>
      <operator activated="true" class="set_role" compatibility="9.2.001" expanded="true" height="82" name="Set Role" width="90" x="782" y="34">
        <parameter key="attribute_name" value="id"/>
        <parameter key="target_role" value="regular"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="rename" compatibility="9.2.001" expanded="true" height="82" name="Rename" width="90" x="916" y="34">
        <parameter key="old_name" value="id"/>
        <parameter key="new_name" value="text"/>
        <list key="rename_additional_attributes"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="9.2.001" expanded="true" height="82" name="Generate ID (2)" width="90" x="1050" y="34">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="operator_toolbox:create_exampleset" compatibility="2.0.001" expanded="true" height="68" name="Create ExampleSet (3)" origin="GENERATED_TUTORIAL" width="90" x="45" y="187">
        <parameter key="generator_type" value="comma_separated_text"/>
        <parameter key="number_of_examples" value="100"/>
        <parameter key="use_stepsize" value="false"/>
        <list key="function_descriptions"/>
        <parameter key="add_id_attribute" value="false"/>
        <list key="numeric_series_configuration"/>
        <list key="date_series_configuration"/>
        <list key="date_series_configuration (interval)"/>
        <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
        <parameter key="input_csv_text" value="searched_item&#10;feature selection&#10;forward selection&#10;software revenue&#10;software license revenue&#10;software licenses revenue&#10;"/>
        <parameter key="column_separator" value=","/>
        <parameter key="parse_all_as_nominal" value="false"/>
        <parameter key="decimal_point_character" value="."/>
        <parameter key="trim_attribute_names" value="false"/>
        <parameter key="empty_strings_as_missing_values" value="true"/>
        <description align="center" color="transparent" colored="false" width="126">Enter your searched items here</description>
      </operator>
      <operator activated="true" class="concurrency:loop_values" compatibility="9.2.001" expanded="true" height="82" name="Loop Values" width="90" x="313" y="187">
        <parameter key="attribute" value="searched_item"/>
        <parameter key="iteration_macro" value="loop_value"/>
        <parameter key="reuse_results" value="true"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="generate_attributes" compatibility="9.2.001" expanded="true" height="82" name="Generate Attributes (2)" width="90" x="380" y="34">
            <list key="function_descriptions">
              <parameter key="%{loop_value}" value="1"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <connect from_port="input 1" to_op="Generate Attributes (2)" to_port="example set input"/>
          <connect from_op="Generate Attributes (2)" from_port="example set output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="generate_id" compatibility="9.2.001" expanded="true" height="82" name="Generate ID" width="90" x="715" y="187">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="concurrency:join" compatibility="9.2.001" expanded="true" height="82" name="Join" width="90" x="1184" y="85">
        <parameter key="remove_double_attributes" value="true"/>
        <parameter key="join_type" value="left"/>
        <parameter key="use_id_attribute_as_key" value="true"/>
        <list key="key_attributes"/>
        <parameter key="keep_both_join_attributes" value="false"/>
      </operator>
      <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.001" expanded="true" height="82" name="Loop Attributes" width="90" x="1318" y="85">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="id|Macro|searched_item|text"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="attribute_name_macro" value="loop_attribute"/>
        <parameter key="reuse_results" value="true"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="generate_attributes" compatibility="9.2.001" expanded="true" height="82" name="Generate Attributes (3)" width="90" x="380" y="34">
            <list key="function_descriptions">
              <parameter key="%{loop_attribute}" value="text"/>
            </list>
            <parameter key="keep_all" value="true"/>
          </operator>
          <connect from_port="input 1" to_op="Generate Attributes (3)" to_port="example set input"/>
          <connect from_op="Generate Attributes (3)" from_port="example set output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="nominal_to_text" compatibility="9.2.001" expanded="true" height="82" name="Nominal to Text" width="90" x="1452" y="85">
        <parameter key="attribute_filter_type" value="all"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="nominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="file_path"/>
        <parameter key="block_type" value="single_value"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="single_value"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
      </operator>
      <operator activated="true" breakpoints="after" class="select_attributes" compatibility="9.2.001" expanded="true" height="82" name="Select Attributes" width="90" x="1586" y="85">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="att_1|id|Macro|text|Value|searched_item"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.001" expanded="true" height="82" name="Loop Attributes (2)" width="90" x="1854" y="85">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="id|Macro|text|Value"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="attribute_name_macro" value="loop_attribute"/>
        <parameter key="reuse_results" value="false"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" breakpoints="after" class="filter_examples" compatibility="9.2.001" expanded="true" height="103" name="Filter Examples (2)" width="90" x="380" y="34">
            <parameter key="parameter_expression" value="contains(%{loop_attribute},#{loop_attribute})"/>
            <parameter key="condition_class" value="custom_filters"/>
            <parameter key="invert_filter" value="false"/>
            <list key="filters_list">
              <parameter key="filters_entry_key" value="%{loop_attribute}.contains.#{loop_attribute}"/>
            </list>
            <parameter key="filters_logic_and" value="false"/>
            <parameter key="filters_check_metadata" value="true"/>
          </operator>
          <connect from_port="input 1" to_op="Filter Examples (2)" to_port="example set input"/>
          <connect from_op="Filter Examples (2)" from_port="example set output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="subprocess" compatibility="9.2.001" expanded="true" height="82" name="Union Append" origin="GENERATED_COMMUNITY" width="90" x="1988" y="85">
        <process expanded="true">
          <operator activated="true" class="loop_collection" compatibility="9.2.001" expanded="true" height="82" name="Output (4)" origin="GENERATED_COMMUNITY" width="90" x="45" y="34">
            <parameter key="set_iteration_macro" value="true"/>
            <parameter key="macro_name" value="iteration"/>
            <parameter key="macro_start_value" value="1"/>
            <parameter key="unfold" value="false"/>
            <process expanded="true">
              <operator activated="false" breakpoints="after" class="select" compatibility="9.2.001" expanded="true" height="68" name="Select (5)" origin="GENERATED_COMMUNITY" width="90" x="112" y="34">
                <parameter key="index" value="%{iteration}"/>
                <parameter key="unfold" value="false"/>
              </operator>
              <operator activated="true" class="branch" compatibility="9.2.001" expanded="true" height="82" name="Branch (2)" origin="GENERATED_COMMUNITY" width="90" x="313" y="34">
                <parameter key="condition_type" value="expression"/>
                <parameter key="expression" value="%{iteration}==1"/>
                <parameter key="io_object" value="ANOVAMatrix"/>
                <parameter key="return_inner_output" value="true"/>
                <process expanded="true">
                  <connect from_port="condition" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
                <process expanded="true">
                  <operator activated="true" class="recall" compatibility="9.2.001" expanded="true" height="68" name="Recall (5)" origin="GENERATED_COMMUNITY" width="90" x="45" y="187">
                    <parameter key="name" value="LoopData"/>
                    <parameter key="io_object" value="ExampleSet"/>
                    <parameter key="remove_from_store" value="true"/>
                  </operator>
                  <operator activated="true" class="union" compatibility="9.2.001" expanded="true" height="82" name="Union (2)" origin="GENERATED_COMMUNITY" width="90" x="179" y="34"/>
                  <connect from_port="condition" to_op="Union (2)" to_port="example set 1"/>
                  <connect from_op="Recall (5)" from_port="result" to_op="Union (2)" to_port="example set 2"/>
                  <connect from_op="Union (2)" from_port="union" to_port="input 1"/>
                  <portSpacing port="source_condition" spacing="0"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="sink_input 1" spacing="0"/>
                  <portSpacing port="sink_input 2" spacing="0"/>
                </process>
              </operator>
              <operator activated="true" class="remember" compatibility="9.2.001" expanded="true" height="68" name="Remember (5)" origin="GENERATED_COMMUNITY" width="90" x="581" y="34">
                <parameter key="name" value="LoopData"/>
                <parameter key="io_object" value="ExampleSet"/>
                <parameter key="store_which" value="1"/>
                <parameter key="remove_from_process" value="true"/>
              </operator>
              <connect from_port="single" to_op="Branch (2)" to_port="condition"/>
              <connect from_op="Branch (2)" from_port="input 1" to_op="Remember (5)" to_port="store"/>
              <connect from_op="Remember (5)" from_port="stored" to_port="output 1"/>
              <portSpacing port="source_single" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="select" compatibility="9.2.001" expanded="true" height="68" name="Select (6)" origin="GENERATED_COMMUNITY" width="90" x="179" y="34">
            <parameter key="index" value="%{iteration}"/>
            <parameter key="unfold" value="false"/>
          </operator>
          <connect from_port="in 1" to_op="Output (4)" to_port="collection"/>
          <connect from_op="Output (4)" from_port="output 1" to_op="Select (6)" to_port="collection"/>
          <connect from_op="Select (6)" from_port="selected" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="source_in 2" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.001" expanded="true" height="82" name="Loop Attributes (3)" width="90" x="2122" y="85">
        <parameter key="attribute_filter_type" value="all"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="attribute_name_macro" value="loop_attribute"/>
        <parameter key="reuse_results" value="true"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="remove_duplicates" compatibility="9.2.001" expanded="true" height="103" name="Remove Duplicates" width="90" x="380" y="34">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="%{loop_attribute}"/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="treat_missing_values_as_duplicates" value="false"/>
          </operator>
          <connect from_port="input 1" to_op="Remove Duplicates" to_port="example set input"/>
          <connect from_op="Remove Duplicates" from_port="example set output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="operator_toolbox:filter_missing_examples" compatibility="2.0.001" expanded="true" height="82" name="Filter Examples with Missing Values" width="90" x="2256" y="85">
        <parameter key="filter_method" value="one or more non-missing"/>
        <parameter key="maximum_number_of_missings" value="100"/>
        <parameter key="maximum_relative_number_of_missings" value="0.1"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="9.2.001" expanded="true" height="82" name="Generate ID (4)" width="90" x="2390" y="136">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="concurrency:join" compatibility="9.2.001" expanded="true" height="82" name="Join (2)" width="90" x="2524" y="136">
        <parameter key="remove_double_attributes" value="true"/>
        <parameter key="join_type" value="right"/>
        <parameter key="use_id_attribute_as_key" value="true"/>
        <list key="key_attributes"/>
        <parameter key="keep_both_join_attributes" value="false"/>
      </operator>
      <operator activated="true" class="time_series:replace_missing_values" compatibility="9.2.001" expanded="true" height="68" name="Replace Missing Values (Series)" width="90" x="2658" y="136">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="metadata_file"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="nominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="single_value"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="has_indices" value="false"/>
        <parameter key="indices_attribute" value=""/>
        <parameter key="overwrite_attributes" value="true"/>
        <parameter key="new_attributes_postfix" value="_cleaned"/>
        <parameter key="replace_type_numerical" value="previous value"/>
        <parameter key="replace_type_nominal" value="previous value"/>
        <parameter key="replace_type_date_time" value="previous value"/>
        <parameter key="replace_value_numerical" value="0.0"/>
        <parameter key="replace_value_nominal" value="unknown"/>
        <parameter key="skip_other_missings" value="true"/>
        <parameter key="replace_infinity" value="true"/>
        <parameter key="replace_empty_strings" value="true"/>
        <parameter key="ensure_finite_values" value="false"/>
      </operator>
      <connect from_op="Read Document" from_port="output" to_op="Process Documents" to_port="documents 1"/>
      <connect from_op="Process Documents" from_port="example set" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Transpose" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Select Attributes (2)" to_port="example set input"/>
      <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Generate ID (3)" to_port="example set input"/>
      <connect from_op="Generate ID (3)" from_port="example set output" to_op="Join (2)" to_port="left"/>
      <connect from_op="Transpose" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Rename" to_port="example set input"/>
      <connect from_op="Rename" from_port="example set output" to_op="Generate ID (2)" to_port="example set input"/>
      <connect from_op="Generate ID (2)" from_port="example set output" to_op="Join" to_port="left"/>
      <connect from_op="Create ExampleSet (3)" from_port="output" to_op="Loop Values" to_port="input 1"/>
      <connect from_op="Loop Values" from_port="output 1" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Join" to_port="right"/>
      <connect from_op="Join" from_port="join" to_op="Loop Attributes" to_port="input 1"/>
      <connect from_op="Loop Attributes" from_port="output 1" to_op="Nominal to Text" to_port="example set input"/>
      <connect from_op="Nominal to Text" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Loop Attributes (2)" to_port="input 1"/>
      <connect from_op="Loop Attributes (2)" from_port="output 1" to_op="Union Append" to_port="in 1"/>
      <connect from_op="Union Append" from_port="out 1" to_op="Loop Attributes (3)" to_port="input 1"/>
      <connect from_op="Loop Attributes (3)" from_port="output 1" to_op="Filter Examples with Missing Values" to_port="example set"/>
      <connect from_op="Filter Examples with Missing Values" from_port="filtered example set" to_op="Generate ID (4)" to_port="example set input"/>
      <connect from_op="Generate ID (4)" from_port="example set output" to_op="Join (2)" to_port="right"/>
      <connect from_op="Join (2)" from_port="join" to_op="Replace Missing Values (Series)" to_port="example set"/>
      <connect from_op="Replace Missing Values (Series)" from_port="example set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Thanks for your listening,

Regards,

Lionel




Tagged:

Best Answers

  • lionelderkrikorlionelderkrikor Posts: 1,068   Unicorn
    Solution Accepted
    Hi @Telcontar120,

    Thanks for your answer. I'm not sure to understand, can you develop your approach ?

    Regards,

    Lionel
  • tftemmetftemme Posts: 135  RM Research
    edited May 2019 Solution Accepted
    Hi @lionelderkrikor

    I think @Telcontar120 is referring to the fact that the custom filters in Filter Examples cannot handle macros in the attribute field, while Select Attributes can. So you have to circumvent this. I would just use a Rename operator with renaming the %{loop_attribute} to current_attribute (or similar and using this in the Filter Examples. And renaming the current_attribute back to %{loop_attribute}.

    Below an demo process:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.2.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.2.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="179" y="34">
            <parameter key="generator_type" value="attribute functions"/>
            <parameter key="number_of_examples" value="6"/>
            <parameter key="use_stepsize" value="false"/>
            <list key="function_descriptions">
              <parameter key="a" value="if(id==1,&quot;a&quot;,&quot;ldjf&quot;)"/>
              <parameter key="b" value="if(id==2,&quot;b&quot;,&quot;fd&quot;)"/>
              <parameter key="c" value="if(id==3,&quot;c&quot;,&quot;j78th&quot;)"/>
            </list>
            <parameter key="add_id_attribute" value="true"/>
            <list key="numeric_series_configuration"/>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <operator activated="true" class="concurrency:loop_attributes" compatibility="9.2.001" expanded="true" height="82" name="Loop Attributes" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="attribute_name_macro" value="loop_attribute"/>
            <parameter key="reuse_results" value="false"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="rename" compatibility="9.2.001" expanded="true" height="82" name="Rename" width="90" x="112" y="34">
                <parameter key="old_name" value="%{loop_attribute}"/>
                <parameter key="new_name" value="current_attribute"/>
                <list key="rename_additional_attributes"/>
              </operator>
              <operator activated="true" class="filter_examples" compatibility="9.2.001" expanded="true" height="103" name="Filter Examples" width="90" x="313" y="34">
                <parameter key="parameter_expression" value=""/>
                <parameter key="condition_class" value="custom_filters"/>
                <parameter key="invert_filter" value="false"/>
                <list key="filters_list">
                  <parameter key="filters_entry_key" value="current_attribute.contains.%{loop_attribute}"/>
                </list>
                <parameter key="filters_logic_and" value="true"/>
                <parameter key="filters_check_metadata" value="true"/>
              </operator>
              <operator activated="true" class="rename" compatibility="9.2.001" expanded="true" height="82" name="Rename (2)" width="90" x="514" y="34">
                <parameter key="old_name" value="current_attribute"/>
                <parameter key="new_name" value="%{loop_attribute}"/>
                <list key="rename_additional_attributes"/>
              </operator>
              <connect from_port="input 1" to_op="Rename" to_port="example set input"/>
              <connect from_op="Rename" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Rename (2)" to_port="example set input"/>
              <connect from_op="Rename (2)" from_port="example set output" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="append" compatibility="9.2.001" expanded="true" height="82" name="Append" width="90" x="447" y="34">
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
            <parameter key="merge_type" value="all"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_op="Loop Attributes" to_port="input 1"/>
          <connect from_op="Loop Attributes" from_port="output 1" to_op="Append" to_port="example set 1"/>
          <connect from_op="Append" from_port="merged set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    


Answers

  • lionelderkrikorlionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,068   Unicorn
    Hi,

    @Telcontar120, OK , I understood.

    @tftemme, I must admit that your solution is ingenious.

    To conclude, thanks you both :  the combinaison of your suggestions solves this problem.

    Regards,

    Lionel
Sign In or Register to comment.