replace carriage returns

HKIHKI Member Posts: 9 Contributor II
edited December 2018 in Help
 
 Hello, I have data that I read from an Excel cell; after reading the file, it ends up in an attribute.

In these cells I find carriage returns along with other information.

I tried a replace on my attribute using \n, but without success.

Example:

Text1(\n)

Text1

Result I want:

Text1 

 

This is my replace expression in a Generate Attributes operator:

replace([Txt], "\n" , "")



 






I hope it's clear.

Thanks for your help.


Best Answer

  • Options
    HKIHKI Member Posts: 9 Contributor II
    Solution Accepted

    Thanks very much, it works better!

     

Answers

  • Options
    kypexinkypexin Moderator, RapidMiner Certified Analyst, Member Posts: 291 Unicorn

    Have you tried \r or \r\n instead? (I can't check myself right now, but it could help maybe...)
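
The point about \r and \r\n can be illustrated outside RapidMiner. The following is a minimal sketch in Python (chosen only for illustration; it is not RapidMiner's expression language), showing that the same visible line break may be stored as LF, CR, or CRLF, so a replacement that targets only \n can leave a stray \r behind. The helper name `strip_line_breaks` and the sample strings are hypothetical.

```python
import re

def strip_line_breaks(text: str) -> str:
    # Remove every common line-break variant: CRLF, lone CR, or lone LF.
    # CRLF is listed first so a Windows line break is removed as one unit.
    return re.sub(r"\r\n|\r|\n", "", text)

# Hypothetical cell contents, one per line-break style.
samples = ["Text1\nText1", "Text1\rText1", "Text1\r\nText1"]
cleaned = [strip_line_breaks(s) for s in samples]
print(cleaned)

# Replacing only LF leaves the carriage return in a CRLF cell.
leftover = "Text1\r\nText1".replace("\n", "")
print(repr(leftover))
```

If the cell turns out to use CRLF, a pattern covering all three variants avoids having to guess which one the file contains.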

  • Options
    sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @HKI - so carriage returns are funny.  They can be represented by a variety of things in unicode.  Have you tried using \r instead of \n?  In order to debug this, I would recommend encoding the attribute (use the "Encode URL" even though it's not a URL) and see how those carriage returns are coded.  Works every time.  :)


    Scott
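
The debugging idea in the post above (encode the value so invisible characters become visible) can be sketched in Python. This is an illustration only, not the RapidMiner "Encode URL" operator itself; the helper name `reveal_line_breaks` and the sample values are assumptions.

```python
from urllib.parse import quote

def reveal_line_breaks(text: str) -> str:
    # Percent-encode the text so control characters show up as %XX codes.
    return quote(text, safe="")

print(reveal_line_breaks("Text1\nText2"))    # LF appears as %0A
print(reveal_line_breaks("Text1\r\nText2"))  # CRLF appears as %0D%0A
print(reveal_line_breaks("Text1\rText2"))    # CR appears as %0D
```

Seeing %0D, %0A, or both in the encoded value tells you which character to target in the replacement pattern.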

     

  • Options
    HKIHKI Member Posts: 9 Contributor II

    Thanks, all, for your answers.

    But sorry, I haven't found the solution yet.

    Let's approach the problem differently: I want to retrieve the first line of my cell.

    For example, in the Generate Attributes operator, I could use a substr() function from position 1 up to the carriage return, if such a substr() function exists :-)

    So my example would look like this:

    Original cell:

    Text1

    Text2

     

    Result :

    substr(cell, 1, carriage return) ==> Text1 (only)

     
     Thanks for helping :-)
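
The "first line of the cell" behavior requested above can be sketched in Python. This is only an illustration of the intended result, not a RapidMiner function; `first_line` is a hypothetical helper, and the sample cell mirrors the Text1/Text2 example.

```python
import re

def first_line(cell: str) -> str:
    # Split once at the first line break of any kind (CRLF, CR, or LF)
    # and keep only the part before it.
    return re.split(r"\r\n|\r|\n", cell, maxsplit=1)[0]

print(first_line("Text1\nText2"))
print(first_line("Text1\r\nText2\r\nText3"))
print(first_line("Text1"))  # a cell without a line break is returned unchanged
```

Splitting at the first line break sidesteps the need to compute a substring position by hand.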



     






     


  • Options
    sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @HKI - so the best way for us to help you is for you to post both the excel sheet you're trying to parse, and the RapidMiner process you have built so far.  Please post the process in XML format using the </> button.  Thanks.


    Scott

     

  • Options
    HKIHKI Member Posts: 9 Contributor II

    hi,

    Here is my process, and my Excel file:

    <?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
    <operator activated="true" class="read_excel" compatibility="7.5.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
    <parameter key="excel_file" value="C:\Users\phki12731\Desktop\tst RapidMiner.xlsx"/>
    <parameter key="sheet_number" value="1"/>
    <parameter key="imported_cell_range" value="A1:B4"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <parameter key="date_format" value=""/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Nom.true.polynominal.attribute"/>
    <parameter key="1" value="Prénom.true.polynominal.attribute"/>
    </list>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
    <parameter key="data_management" value="auto"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
    <operator activated="true" class="generate_attributes" compatibility="7.5.001" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="34">
    <list key="function_descriptions">
    <parameter key="Nom" value="upper([Nom])"/>
    <parameter key="Prénom" value="replace([Prénom], &quot;\\n&quot; , &quot;*&quot;)"/>
    </list>
    <parameter key="keep_all" value="true"/>
    </operator>
    </process>
  • Options
    sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @HKI ok thank you.  That XML is corrupted - it looks like pieces of two processes.  And you are using RM 7.5.1?  That's a pretty old version. I would highly recommend downloading Studio 7.6.1 (most recent version) by logging into my.rapidminer.com.


    Scott

     

  • Options
    HKIHKI Member Posts: 9 Contributor II

    Exactly, I have version 7.5. I'm trying to download the right version right now :-)

    In the meantime, here is a preview of my process as a screenshot.

    Thank you for helping.

     

     



     


  • Options
    sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    ok.  Let's wait until you upgrade to 7.6.1 and load the process there.  That will be much easier.  And I don't see the actual Excel file - just images.

     

    Scott

     

  • Options
    HKIHKI Member Posts: 9 Contributor II
     
    Hi,

     I am waiting for approval to install this version. I am not an admin on my workstation; we have dedicated teams to do it :-)

     



     






     


  • Options
    sgenzersgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    oh dear.  OK.  Holding pattern confirmed.  :)

  • Options
    Edin_KlapicEdin_Klapic Moderator, Employee, RMResearcher, Member Posts: 299 RM Data Scientist

    Hi @HKI,

     

    the following expression in "Generate Attributes" solves this problem on my side:

     

    replaceAll(old_attribute,"\\n.*","")

     

    Explanation:

    • "replaceAll" accepts a regular expression as its middle argument (the part that selects what to replace) - click on the double arrows to the right of "Text transformations" in the left (i.e. Functions) panel of "Generate Attributes"
    • The regex for newline "\n" needs an extra leading backslash ("\\n"), since a backslash inside the expression string is itself used to escape the following character.

    Hope this helps,

    Edin
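
The two points in the explanation above can be checked with a small Python sketch. Python's `re.sub` is used here as a stand-in for RapidMiner's `replaceAll` (an assumption for illustration; the two regex engines are not identical in every detail), and the variable names and sample cells are hypothetical.

```python
import re

# Inside an ordinary string literal, "\\n" is a backslash followed by "n".
# The regex engine then reads that pair as the newline escape.
pattern = "\\n.*"
same_as_raw = pattern == r"\n.*"

# "." does not match a newline by default, so each match covers one
# line break plus the rest of that line.
cell = "Text1\nText2"
first = re.sub(pattern, "", cell)
print(first)

# Hypothetical variant that also drops a carriage return before the newline,
# in case the cell uses Windows-style line breaks.
crlf_cell = "Text1\r\nText2"
second = re.sub(r"\r?\n.*", "", crlf_cell)
print(second)
```

Both calls keep only the text before the first line break, which is the result asked for earlier in the thread.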
