
replace hyphen

pb42 Member Posts: 16 Contributor I
I am trying to replace a hyphen in a Grade attribute using the Replace operator. I would like to replace it with text indicating that no value has been entered (i.e., Not indicated). The problem is that the attribute includes values such as - (the hyphen I want to replace), A-, B-, and C-. Using the Replace operator replaces all of the hyphens, including those used as minus signs. I tried the regular expression \b[-]\b, but that is not working. I also tried \b["-"]\b without success.
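
A likely reason the \b attempts fail is that \b only matches between a word character and a non-word character, so a value consisting of a lone hyphen has no word boundary on either side. The following is a minimal sketch in Python, used here only for illustration (RapidMiner itself uses Java regular expressions, which should behave the same way for these patterns). The sample grades are made up, and anchoring with ^ and $ is one possible fix, not necessarily the one from the linked thread.

```python
import re

# Hypothetical sample values for the Grade attribute.
grades = ["-", "A-", "B-", "C-", "A", "B+"]

# \b needs a word character next to it, so it never matches around a lone "-"
# or after the trailing "-" in "A-".
word_boundary = re.compile(r"\b[-]\b")

# Anchoring to the start and end matches only values that are exactly "-".
anchored = re.compile(r"^-$")

unchanged = [word_boundary.sub("Not indicated", g) for g in grades]
cleaned = [anchored.sub("Not indicated", g) for g in grades]

print(unchanged)
print(cleaned)
```

In the Replace operator, the corresponding settings would presumably be ^-$ for "replace what" and Not indicated for "replace by".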

Answers

  • [Deleted User] Posts: 0 Newbie
    @pb42

    Hello

    This is very similar to your question ;) Please take a look at it :)

    https://community.rapidminer.com/discussion/comment/63840#Comment_63840

    I hope this helps
    mbs
  • pb42 Member Posts: 16 Contributor I
    Thank you for the direction. I did read this question, but the solution did not make sense to me.
  • varunm1 Member Posts: 1,207 Unicorn
    Hello @pb42

    Can you provide some sample data?
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • pb42 Member Posts: 16 Contributor I
    This is the file
  • sgnarkhede2016 Member Posts: 152 Contributor II
    But in the Replace operator I need to pass a regex, and it is not working for me.
    E.g.:
    Sachin N
    John Clara

    I have passed "replace what":  \^(\w+ \w+)
                  "replace by":    \("\w+ \w+")

    I want the above strings as "Sachin N" and "John Clara"
  • Edin_Klapic Employee, RMResearcher, Member Posts: 299 RM Data Scientist
    If I understood you correctly, you want the entries in the Attributes to be enclosed in leading and trailing double quotes: Value => "Value"
    In this case, you replace:
    ^(.+)$
    by
    "$1"
    Happy Mining,
    Edin

    P.S.:
    The Operator Generate Attributes could also have been used. The expression would have been:
    "\"" + AttributeName + "\""
    where AttributeName is the name of the Attribute whose values you want to change.
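
The quoting replacement above can be tried outside RapidMiner. This is a minimal sketch in Python, chosen only for illustration; note that Python writes the captured group as \1 in the replacement, whereas the Replace operator uses $1. The function name add_quotes and the sample values are assumptions for the example.

```python
import re

def add_quotes(value: str) -> str:
    """Wrap the whole value in double quotes using a capturing group."""
    # ^(.+)$ captures the entire value; \1 re-inserts it between the quotes.
    return re.sub(r"^(.+)$", r'"\1"', value)

names = ["Sachin N", "John Clara"]
quoted = [add_quotes(n) for n in names]

# Plain concatenation, mirroring the Generate Attributes expression.
concatenated = ['"' + n + '"' for n in names]

print(quoted)
```

Both approaches should give the same result for non-empty values; the regex leaves an empty value unchanged because .+ requires at least one character.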
  • sara20 Member Posts: 110 Unicorn
    edited May 2020
    @Edin_Klapic

    Hello

    I am working on data for a store and I want to analyze the customers' baskets. The column names contain a lot of symbols that RM is not able to understand, and I cannot replace all of them because they are of different types. Could you please tell me how I can solve this?

    I also think it would be useful if the RM team could solve this problem in the next version of RM (feature request).

    Thank you in advance
    sara
  • Edin_Klapic Employee, RMResearcher, Member Posts: 299 RM Data Scientist
    Hi @sara20 ,

    Although your problem is somewhat similar to the above-mentioned "hyphen" issue, it affects the names of Attributes and not Attribute values.
    Thus, for the future, I suggest that you open a new thread when the answers in a thread don't provide the help you need. That also makes it easier to find for users who might have a similar problem in the future.

    You can use "Rename by Replacing" to replace certain patterns represented by Regular Expressions, but only one at a time.
    So, unfortunately, the solution to your problem is not yet (as of version 9.6) a single-Operator solution. Please find attached a quick solution that uses "Rename by Replacing" in loops together with a self-created dictionary, with which you will hopefully be able to achieve your desired goal.

    Happy Mining,
    Edin

    <?xml version="1.0" encoding="UTF-8"?><process version="9.5.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.5.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.5.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="179" y="34">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="concurrency:loop_attributes" compatibility="9.5.001" expanded="true" height="82" name="Loop Attributes" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="attribute_name_macro" value="loop_attribute"/>
            <parameter key="reuse_results" value="true"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="utility:create_exampleset" compatibility="9.5.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="85">
                <parameter key="generator_type" value="comma separated text"/>
                <parameter key="number_of_examples" value="100"/>
                <parameter key="use_stepsize" value="false"/>
                <list key="function_descriptions"/>
                <parameter key="add_id_attribute" value="false"/>
                <list key="numeric_series_configuration"/>
                <list key="date_series_configuration"/>
                <list key="date_series_configuration (interval)"/>
                <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
                <parameter key="time_zone" value="SYSTEM"/>
                <parameter key="input_csv_text" value="old,new&#10;o,-&#10;i,%"/>
                <parameter key="column_separator" value=","/>
                <parameter key="parse_all_as_nominal" value="true"/>
                <parameter key="decimal_point_character" value="."/>
                <parameter key="trim_attribute_names" value="true"/>
              </operator>
              <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (4)" width="90" x="246" y="85">
                <parameter key="macro" value="number_of_examples"/>
                <parameter key="macro_type" value="number_of_examples"/>
                <parameter key="statistics" value="average"/>
                <parameter key="attribute_name" value=""/>
                <list key="additional_macros"/>
              </operator>
              <operator activated="true" class="concurrency:loop" compatibility="9.5.001" expanded="true" height="103" name="Loop (2)" width="90" x="380" y="187">
                <parameter key="number_of_iterations" value="%{number_of_examples}"/>
                <parameter key="iteration_macro" value="iteration"/>
                <parameter key="reuse_results" value="true"/>
                <parameter key="enable_parallel_execution" value="false"/>
                <process expanded="true">
                  <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (5)" width="90" x="112" y="34">
                    <parameter key="macro" value="old_character"/>
                    <parameter key="macro_type" value="data_value"/>
                    <parameter key="statistics" value="average"/>
                    <parameter key="attribute_name" value="old"/>
                    <parameter key="example_index" value="%{iteration}"/>
                    <list key="additional_macros"/>
                  </operator>
                  <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (6)" width="90" x="246" y="34">
                    <parameter key="macro" value="new_character"/>
                    <parameter key="macro_type" value="data_value"/>
                    <parameter key="statistics" value="average"/>
                    <parameter key="attribute_name" value="new"/>
                    <parameter key="example_index" value="%{iteration}"/>
                    <list key="additional_macros"/>
                  </operator>
                  <operator activated="true" class="delay" compatibility="9.5.001" expanded="true" height="103" name="only to ensure execution order (2)" width="90" x="447" y="85">
                    <parameter key="delay" value="none"/>
                    <parameter key="delay_amount" value="1000"/>
                    <parameter key="min_delay_amount" value="0"/>
                    <parameter key="max_delay_amount" value="1000"/>
                  </operator>
                  <operator activated="true" class="rename_by_replacing" compatibility="9.5.001" expanded="true" height="82" name="Rename by Replacing (2)" width="90" x="581" y="136">
                    <parameter key="attribute_filter_type" value="all"/>
                    <parameter key="attribute" value=""/>
                    <parameter key="attributes" value=""/>
                    <parameter key="use_except_expression" value="false"/>
                    <parameter key="value_type" value="attribute_value"/>
                    <parameter key="use_value_type_exception" value="false"/>
                    <parameter key="except_value_type" value="time"/>
                    <parameter key="block_type" value="attribute_block"/>
                    <parameter key="use_block_type_exception" value="false"/>
                    <parameter key="except_block_type" value="value_matrix_row_start"/>
                    <parameter key="invert_selection" value="false"/>
                    <parameter key="include_special_attributes" value="false"/>
                    <parameter key="replace_what" value="%{old_character}"/>
                    <parameter key="replace_by" value="%{new_character}"/>
                  </operator>
                  <connect from_port="input 1" to_op="Extract Macro (5)" to_port="example set"/>
                  <connect from_port="input 2" to_op="only to ensure execution order (2)" to_port="through 2"/>
                  <connect from_op="Extract Macro (5)" from_port="example set" to_op="Extract Macro (6)" to_port="example set"/>
                  <connect from_op="Extract Macro (6)" from_port="example set" to_op="only to ensure execution order (2)" to_port="through 1"/>
                  <connect from_op="only to ensure execution order (2)" from_port="through 1" to_port="output 1"/>
                  <connect from_op="only to ensure execution order (2)" from_port="through 2" to_op="Rename by Replacing (2)" to_port="example set input"/>
                  <connect from_op="Rename by Replacing (2)" from_port="example set output" to_port="output 2"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="source_input 2" spacing="0"/>
                  <portSpacing port="source_input 3" spacing="0"/>
                  <portSpacing port="sink_output 1" spacing="0"/>
                  <portSpacing port="sink_output 2" spacing="0"/>
                  <portSpacing port="sink_output 3" spacing="0"/>
                </process>
              </operator>
              <connect from_port="input 1" to_op="Loop (2)" to_port="input 2"/>
              <connect from_op="Create ExampleSet" from_port="output" to_op="Extract Macro (4)" to_port="example set"/>
              <connect from_op="Extract Macro (4)" from_port="example set" to_op="Loop (2)" to_port="input 1"/>
              <connect from_op="Loop (2)" from_port="output 2" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="147"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Loop Attributes" to_port="input 1"/>
          <connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
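
The idea behind the process above, applying one replacement per dictionary row to the attribute names, can be sketched in a few lines of Python. This is only an illustration of the logic, not a translation of the operators. The dictionary rows come from the input_csv_text parameter (o becomes -, i becomes %), the attribute names are assumed to be those of the Golf sample data, and rename_by_replacing is a hypothetical helper.

```python
import re

# Dictionary rows from the process: each "old" pattern is replaced by "new".
replacements = [("o", "-"), ("i", "%")]

# Attribute names assumed for the Golf sample data set.
names = ["Outlook", "Temperature", "Humidity", "Wind", "Play"]

def rename_by_replacing(names, replacements):
    """Apply each (pattern, replacement) pair in turn to every name."""
    renamed = list(names)
    for old, new in replacements:
        # One pass per dictionary row, like one loop iteration in the process.
        renamed = [re.sub(old, new, name) for name in renamed]
    return renamed

renamed = rename_by_replacing(names, replacements)
print(renamed)
```

To clean real column names, the dictionary would list the problematic symbols in the old column and their substitutes in the new column; symbols with a special meaning in regular expressions would need escaping.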



  • sara20 Member Posts: 110 Unicorn
    Thank you very much
