RapidMiner

Replace single characters without changing strings value of the attribute

SOLVED
Contributor I f_laperna


Hi, I want to replace some values in an attribute of my dataset. In particular, the attribute contains some single characters (like "C", "P", "A") and some strings (like "SPAIN", "ITALY", etc.).

I want to modify the value A without changing the string SPAIN. For example, when I replace A with "Other", SPAIN always becomes SPOtherIN.

I tried with A, with "A", and with (A), but without success. Does anyone know how to achieve that? Thank you!

7 REPLIES
Community Manager

Re: Replace single characters without changing strings value of the attribute

hello @f_laperna - have you tried doing this with a RegEx like \sA\s ?

 

Scott

Scott Genzer
Senior Community Manager
RapidMiner, Inc.
Contributor I f_laperna

Re: Replace single characters without changing strings value of the attribute

I tried, but the problem is that I don't have a space character before or after A, so it's not working.
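
To illustrate why the whitespace-based pattern fails here, below is a small sketch in Python rather than RapidMiner. It assumes Python's `re` module behaves like RapidMiner's regex engine for this simple pattern, and the sample strings are made up for illustration.

```python
import re

# \sA\s requires a whitespace character on BOTH sides of the "A".
pattern = re.compile(r"\sA\s")

# A cell that contains only "A" has no surrounding whitespace, so nothing matches.
standalone = pattern.search("A")

# The pattern only matches when "A" sits between whitespace inside a longer text.
embedded = pattern.search("type A value")

print(standalone)            # no match object
print(embedded is not None)  # a match was found
```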
Maven

Re: Replace single characters without changing strings value of the attribute

If your example is representative of your data, you could do the following:

 

1. Generate a new attribute "length" with the "Generate Attributes" operator, using length() in the function expression.

2. Duplicate your data with the "Multiply" operator.

3. Filter both threads that come out of the Multiply output ports. You can use "Filter Examples" with the newly created length attribute. As per your example, the conditions would be length > 1 for one thread and length <= 1 for the other.

4. Now you can replace the values in the filtered thread with length <= 1, e.g. with a RegEx.

5. Glue the two threads back together with the "Append" operator.

 

There is probably a more elegant way of doing the above, but sometimes it helps to break things down into small steps.
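
For anyone who prefers to see the idea as code, here is a minimal sketch of the five steps above in plain Python. It is not a RapidMiner process: the sample values, the function name `replace_short_values` and its parameters are all made up for illustration.

```python
# Hypothetical column of values, mirroring the example in the question.
values = ["A", "SPAIN", "C", "ITALY", "P"]


def replace_short_values(values, target="A", replacement="Other"):
    """Split by length, replace only in the short group, then recombine."""
    # Steps 1-3: use each value's length to split the data into two groups,
    # keeping the row index so the original order can be restored later.
    short = [(i, v) for i, v in enumerate(values) if len(v) <= 1]
    long_ = [(i, v) for i, v in enumerate(values) if len(v) > 1]

    # Step 4: replace only inside the group of short values.
    short = [(i, replacement if v == target else v) for i, v in short]

    # Step 5: append the two groups again. Sorting by row index restores the
    # original order; an append in RapidMiner would simply stack the groups.
    return [v for _, v in sorted(short + long_)]


result = replace_short_values(values)
print(result)
```

Because the replacement never touches the group with length > 1, "SPAIN" and "ITALY" pass through unchanged.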

Community Manager

Re: Replace single characters without changing strings value of the attribute

yes well done @FBT - that will work nicely.  Sorry @f_laperna I did not realize that the content looked like this:

 

A

ALIEN

B

BROKEN

C

CHARLIE

etc...

 

so perhaps the easiest way is to use the Map operator and create a lookup table.  Map only changes a value when the whole string matches, not on a partial match as Replace does.  Otherwise you can use @FBT's idea, even in a single Generate Attributes operator, like this:

 

att1         if(att1=="A", "foo", att1)

 

or something like that.  It's pretty much what Map does but you can be more specific.
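
As a rough illustration of the difference between a whole-value lookup and a substring replace, here is a short Python sketch. It only mimics the behaviour described above and is not RapidMiner code; the lookup table and sample values are assumptions made for the example.

```python
# Hypothetical lookup table: only values that match a key exactly are changed.
lookup = {"A": "Other"}

values = ["A", "ALIEN", "B", "BROKEN"]

# Whole-value lookup (similar in spirit to a lookup table, or to an
# if(att1=="A", ...) expression): "ALIEN" is left alone.
mapped = [lookup.get(v, v) for v in values]

# Substring replace, for comparison: every "A" is replaced, even inside words.
replaced = [v.replace("A", "Other") for v in values]

print(mapped)
print(replaced)
```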

 

Scott

 

RM Certified Expert
Solution

Re: Replace single characters without changing strings value of the attribute

Perhaps a more direct solution would be to use the lookaround syntax with regex.  Here's what you want, I think:

(?<!\w)A(?!\w)

This will match "A" only when it is neither preceded nor followed by another word character (thus it will skip an "A" in the middle of a word).

Try it and see if that works.
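
If you want to check the pattern outside RapidMiner first, here is a quick sketch using Python's `re` module. It assumes the lookaround syntax behaves the same way as in RapidMiner's regex engine, and the sample values are made up.

```python
import re

# Negative lookbehind and lookahead: "A" must not touch a word character
# on either side, so it is matched only when it stands alone.
pattern = re.compile(r"(?<!\w)A(?!\w)")

values = ["A", "SPAIN", "ITALY", "C", "PLAN A"]

# Only a standalone "A" is replaced; "A" inside SPAIN or ITALY is skipped.
replaced = [pattern.sub("Other", v) for v in values]

print(replaced)
```

Note that a standalone "A" inside a longer text (as in the made-up "PLAN A" value) is also replaced, since the lookarounds only rule out adjacent word characters.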

But kudos to @FBT for a creative solution that would also work, albeit a more complex one.

 

 

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts
Contributor I f_laperna

Re: Replace single characters without changing strings value of the attribute

Yes, It worked, and it's exactly the solution I was looking for. Thank you very much!

Community Manager

Re: Replace single characters without changing strings value of the attribute

yes well done @Telcontar120.  I always get tangled up with RegEx lookarounds.  :)

 

Scott
