Substitute Search result in RegEx with new line doesn't work

Hyperrick Member Posts: 21 Contributor I
Hello support team,

I might have found a bug in the RegEx implementation of RapidMiner.
My goal is to replace a blank with a newline character.


Searching in the following text:

Sri Lanka

Using the Replace operator:

Search for

( )

Replace with:

\n

leads to:

SrinLanka

I created a workaround using the "Split" operator, which you can also see in my sample process.

So the question is: is it possible to create a new line with RegEx in RapidMiner?

<?xml version="1.0" encoding="UTF-8"?><process version="9.7.002">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.7.002" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="text:create_document" compatibility="9.3.001" expanded="true" height="68" name="Create Document" width="90" x="112" y="136">
        <parameter key="text" value="Sri Lanka"/>
        <parameter key="add label" value="false"/>
        <parameter key="label_type" value="nominal"/>
      </operator>
      <operator activated="true" class="text:documents_to_data" compatibility="9.3.001" expanded="true" height="82" name="Documents to Data" width="90" x="246" y="136">
        <parameter key="text_attribute" value="text"/>
        <parameter key="add_meta_information" value="true"/>
        <parameter key="datamanagement" value="double_sparse_array"/>
        <parameter key="data_management" value="auto"/>
        <parameter key="use_processed_text" value="false"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="9.7.002" expanded="true" height="103" name="Multiply" width="90" x="447" y="136"/>
      <operator activated="true" class="split" compatibility="9.7.002" expanded="true" height="82" name="Split" width="90" x="648" y="289">
        <parameter key="attribute_filter_type" value="all"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="nominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="file_path"/>
        <parameter key="block_type" value="single_value"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="single_value"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="split_pattern" value=" "/>
        <parameter key="split_mode" value="ordered_split"/>
      </operator>
      <operator activated="true" class="transpose" compatibility="9.7.002" expanded="true" height="82" name="Transpose" width="90" x="782" y="289"/>
      <operator activated="true" class="replace" compatibility="9.7.002" expanded="true" height="82" name="Replace" width="90" x="648" y="34">
        <parameter key="attribute_filter_type" value="all"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="nominal"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="file_path"/>
        <parameter key="block_type" value="single_value"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="single_value"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
        <parameter key="replace_what" value="( )"/>
        <parameter key="replace_by" value="\n"/>
      </operator>
      <connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
      <connect from_op="Documents to Data" from_port="example set" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Replace" to_port="example set input"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Split" to_port="example set input"/>
      <connect from_op="Split" from_port="example set output" to_op="Transpose" to_port="example set input"/>
      <connect from_op="Transpose" from_port="example set output" to_port="result 2"/>
      <connect from_op="Replace" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Kind Regards,

Patrick
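
The "SrinLanka" result above is consistent with how Java's regex replacement strings behave. The sketch below assumes (the thread does not confirm this) that the Replace operator hands the typed replacement text to `java.util.regex` unchanged. In a Java replacement string, a backslash escapes the following character, so the two typed characters `\` and `n` produce a literal `n`, not a line break. The class and method names are hypothetical and only serve the illustration.

```java
public class ReplaceNewlineDemo {

    // Replacement is the two characters backslash + n, as typed into a text field.
    // The regex engine treats the backslash as an escape and emits a literal 'n'.
    static String withTypedBackslashN(String text) {
        return text.replaceAll("( )", "\\n");
    }

    // Replacement is a single real line-feed character (U+000A).
    static String withRealNewline(String text) {
        return text.replaceAll("( )", "\n");
    }

    public static void main(String[] args) {
        System.out.println(withTypedBackslashN("Sri Lanka")); // prints SrinLanka
        System.out.println(withRealNewline("Sri Lanka"));     // prints Sri and Lanka on two lines
    }
}
```

Under that assumption the regex itself is working as specified; the replacement field simply needs to contain an actual line-feed character instead of the escape sequence.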
Answers

  • jacobcybulski Member, University Professor Posts: 391 Unicorn
    I am not sure how useful this is, as the solution is a hack. Switch to the XML view of your process (you can open the XML panel via View > Show Panel > XML, or you can edit the RMP text file directly). In the XML, find the "\n" string that you entered in the Replace operator and change it to "&#10;". If you worked in the XML panel, click the tick, then switch back to the Process view and run. Here is an example based on your process.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.8.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.8.000" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="text:create_document" compatibility="9.3.001" expanded="true" height="68" name="Create Document" width="90" x="45" y="34">
            <parameter key="text" value="Sri Lanka"/>
            <parameter key="add label" value="false"/>
            <parameter key="label_type" value="nominal"/>
          </operator>
          <operator activated="true" class="text:documents_to_data" compatibility="9.3.001" expanded="true" height="82" name="Documents to Data" width="90" x="179" y="34">
            <parameter key="text_attribute" value="text"/>
            <parameter key="add_meta_information" value="true"/>
            <parameter key="datamanagement" value="double_sparse_array"/>
            <parameter key="data_management" value="auto"/>
            <parameter key="use_processed_text" value="false"/>
          </operator>
          <operator activated="true" class="replace" compatibility="9.8.000" expanded="true" height="82" name="Replace" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="text"/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="nominal"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="file_path"/>
            <parameter key="block_type" value="single_value"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="single_value"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="replace_what" value="( )"/>
            <parameter key="replace_by" value="&#10;"/>
          </operator>
          <connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
          <connect from_op="Documents to Data" from_port="example set" to_op="Replace" to_port="example set input"/>
          <connect from_op="Replace" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>




  • Hyperrick Member Posts: 21 Contributor I
    Hi Jacob, thanks for your approach. I tried this, but after editing the XML in the panel view, the Replace input field still shows "\n".



    Kind regards,
    Patrick
  • jacobcybulski Member, University Professor Posts: 391 Unicorn
    Once you change your XML, press the green tick in the upper-left corner of the XML panel. Only then will your change be entered into the process.
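
Why the `&#10;` edit helps can be shown with a plain XML parser. The following is a minimal sketch using the JDK's built-in DOM parser, not RapidMiner's own process loader (whose internals are not shown in this thread). It reads a `replace_by` parameter in both spellings: the typed `\n` stays two characters, while the character reference `&#10;` is decoded to a single line feed. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EntityNewlineDemo {

    // Parses a one-element XML snippet and returns its "value" attribute
    // exactly as the parser delivers it to the application.
    static String parseValue(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("value");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML snippet", e);
        }
    }

    public static void main(String[] args) {
        // value="\n" : backslash followed by the letter n
        String typed = parseValue("<parameter key=\"replace_by\" value=\"\\n\"/>");
        // value="&#10;" : numeric character reference for line feed
        String entity = parseValue("<parameter key=\"replace_by\" value=\"&#10;\"/>");

        System.out.println(typed.length());      // 2
        System.out.println(entity.length());     // 1
        System.out.println(entity.equals("\n")); // true
    }
}
```

Note that a literal line break typed inside an attribute value would be normalized to a space by an XML parser, which is why the character reference is needed.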