
Trim Not Working

edinsda2 Member Posts: 5 Contributor I
edited December 2018 in Help

Hi All,

 

I am trying to use the Trim operator to remove a space at the start of my attribute values, but it doesn't seem to be working. I am using v7.5.

 

<?xml version="1.0" encoding="UTF-8"?><process version="7.5.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.5.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.5.001" expanded="true" height="68" name="Retrieve CountryAndGPName" width="90" x="246" y="85">
<parameter key="repository_entry" value="../Data/CountryAndGPName"/>
</operator>
<operator activated="true" breakpoints="before,after" class="trim" compatibility="7.5.001" expanded="true" height="82" name="Trim" width="90" x="447" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Country"/>
</operator>
<connect from_op="Retrieve CountryAndGPName" from_port="output" to_op="Trim" to_port="example set input"/>
<connect from_op="Trim" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

 

 

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    I'm not on my regular machine, so I can't import the XML, but one note of caution: Trim only works with a polynominal data type. If you have spaces with numbers, then I'd suggest converting them to polynominals, applying Trim, and then converting back to numericals.

     

  • FBT Member Posts: 106 Unicorn

    Hi,

     

    It looks like the whitespace in front of your data points is not real whitespace. When importing it as UTF-8, I get a strange symbol, which indicates a character that is not recognized. Unless you know exactly what this character is, I think the simplest way would be to use the "Replace" operator with a regular expression. See if the one below works for you:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" breakpoints="before" class="replace" compatibility="7.6.001" expanded="true" height="82" name="Replace" width="90" x="313" y="136">
    <parameter key="replace_what" value="[^\u0000-\u007F]+"/>
    </operator>
    <connect from_op="Replace" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
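For readers who want to check outside RapidMiner what the unknown leading character is, and what the regular expression above does to it, here is a minimal Python sketch. The sample values and the helper names `describe_chars` and `strip_non_ascii` are assumptions made for illustration; they are not part of the original process or of any RapidMiner API.

```python
import re
import unicodedata

# Hypothetical sample value: an unknown leading character followed by a country name.
value = "\u00a0Germany"


def describe_chars(text):
    """Return (code point, Unicode name) pairs so odd characters can be identified."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


# Same character class as the replace_what parameter above:
# one or more characters outside the ASCII range U+0000..U+007F.
NON_ASCII = re.compile(r"[^\u0000-\u007F]+")


def strip_non_ascii(text):
    """Delete every run of non-ASCII characters from text."""
    return NON_ASCII.sub("", text)


if __name__ == "__main__":
    print(describe_chars(value)[0])
    print(repr(strip_non_ascii(value)))
```

One design point to keep in mind: this pattern removes every non-ASCII character, not only the leading one, so a value with accented letters (for example "Côte d'Ivoire") would lose them as well. If that matters for your data, targeting the specific character is the safer choice.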
  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Haha @FBT, I was working on the same thing at the same time. It's a non-breaking space (&nbsp;, U+00A0, which becomes %C2%A0 when URL-encoded as UTF-8). Trim will not take care of this, but this will do the trick:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve CountryAndGPName (2)" width="90" x="246" y="85">
    <parameter key="repository_entry" value="//Google Drive/RapidMiner/CountryAndGPName"/>
    </operator>
    <operator activated="true" class="web:encode_urls" compatibility="7.3.000" expanded="true" height="82" name="Encode URLs" width="90" x="380" y="85">
    <parameter key="url_attribute" value="Country"/>
    </operator>
    <operator activated="true" class="replace" compatibility="7.6.001" expanded="true" height="82" name="Replace" width="90" x="514" y="85">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Country"/>
    <parameter key="replace_what" value="%C2%A0"/>
    </operator>
    <connect from_op="Retrieve CountryAndGPName (2)" from_port="output" to_op="Encode URLs" to_port="example set input"/>
    <connect from_op="Encode URLs" from_port="example set output" to_op="Replace" to_port="example set input"/>
    <connect from_op="Replace" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

    Scott
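The encode-then-replace idea in the process above can be sketched in Python as follows. This is only an illustration under assumptions: the sample values and the function names `remove_nbsp_via_encoding` and `remove_nbsp_directly` are hypothetical, and the final decoding step is an addition of this sketch that the posted process does not perform.

```python
from urllib.parse import quote, unquote

NBSP = "\u00a0"          # non-breaking space, U+00A0
NBSP_ENCODED = "%C2%A0"  # its UTF-8 bytes (C2 A0) in percent-encoded form


def remove_nbsp_via_encoding(text):
    """Percent-encode the text, delete the encoded non-breaking spaces, then decode."""
    encoded = quote(text)                       # U+00A0 becomes %C2%A0
    cleaned = encoded.replace(NBSP_ENCODED, "")
    return unquote(cleaned)                     # restore the other encoded characters


def remove_nbsp_directly(text):
    """Delete non-breaking spaces without the encoding round trip."""
    return text.replace(NBSP, "")


if __name__ == "__main__":
    sample = "\u00a0United Kingdom"  # hypothetical attribute value
    print(quote(sample))
    print(repr(remove_nbsp_via_encoding(sample)))
    print(repr(remove_nbsp_directly(sample)))
```

The round trip through percent-encoding makes the invisible character visible as plain text, which is handy when diagnosing the problem. Once the character is known to be U+00A0, replacing it directly gives the same result with fewer steps and leaves the rest of the value untouched.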
