"regex not working"

allerkonge Member Posts: 1 Contributor I
edited June 2019 in Help

Dear all,

 

I'm trying to clean a dataset, and I'm working with a couple of regexes. If I test a regex on the regexpal website, it works fine, but if I put the same regex into RapidMiner with the Replace operator, it says there is a mistake.

 

This is one of the regexes I'm testing:

 

(?=\b[\m*#])\w+

 

When I try this one, RapidMiner says it is incorrect.

 

I tried to correct it by adding backslashes:

 

(?=\\b[\\m*#])\\w+

 

With this version it no longer says the regex is incorrect, but it doesn't replace anything. The attribute is gender, so I'd like to replace, for example, "mm" or "male" with "M".

 

Thanks a lot for your help.

 

 


Best Answer

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted

    Hi,

     

    regexpal only tests JavaScript regexes, but RapidMiner uses the Java regex parser.  Despite the similar names, the two languages are completely different, so there is no guarantee that a JavaScript regex will work in Java.  See here for example: http://stackoverflow.com/questions/21883629/java-vs-javascript-regex-matching
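
To make the dialect difference concrete, here is a minimal Java sketch that feeds both versions of the pattern from the question to java.util.regex. The class and method names are made up for illustration, and it assumes the Replace operator passes the parameter text to Java's regex engine unchanged, which may not hold for every RapidMiner version. In Java, a backslash before a letter that is not a defined escape (such as \m) is a syntax error, while doubling the backslashes turns each one into a literal backslash character.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexDialectCheck {
    // Returns true if Java's regex engine accepts the given pattern text.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Returns true if the pattern matches anywhere in the input.
    static boolean findsIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Java string literals double every backslash, so these two strings
        // hold exactly the pattern texts typed in the question.
        String original = "(?=\\b[\\m*#])\\w+";        // (?=\b[\m*#])\w+
        String doubled = "(?=\\\\b[\\\\m*#])\\\\w+";   // (?=\\b[\\m*#])\\w+

        System.out.println(compiles(original));        // false: \m is not a Java escape
        System.out.println(compiles(doubled));         // true: every \\ is a literal backslash
        System.out.println(findsIn(doubled, "male"));  // false: it now looks for real backslashes
    }
}
```

This would explain both symptoms in the question: the first pattern is rejected outright, and the second is accepted but never matches ordinary values like "male".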

     

    Anyway, in your case can't you just use "m.*" (without quotes) in "replace_what" and "M" in "replace_by"?
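
As a rough illustration of that suggestion, the sketch below applies the same pattern with plain Java String.replaceAll. It assumes the Replace operator behaves like replaceAll (every match is replaced, wherever it starts in the value); the class and method names, and the anchored variant, are hypothetical additions for illustration.

```java
public class GenderCleanup {
    // Mirrors replace_what = "m.*" and replace_by = "M".
    static String normalize(String value) {
        return value.replaceAll("m.*", "M");
    }

    // Hypothetical variant: only rewrite values that START with "m".
    static String normalizeAnchored(String value) {
        return value.replaceAll("^m.*", "M");
    }

    public static void main(String[] args) {
        String[] samples = {"mm", "male", "m", "female"};
        for (String v : samples) {
            System.out.println(v + " -> " + normalize(v) + " / " + normalizeAnchored(v));
        }
        // mm -> M / M
        // male -> M / M
        // m -> M / M
        // female -> feM / female
    }
}
```

If the column also contains values such as "female", the unanchored pattern would rewrite the trailing "male" as well, so anchoring with ^ may be the safer choice under the assumption above.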

     

    I am not sure if it needs to be more complicated than that, but I don't know your data, of course...

     

    Hope this helps,

    Ingo
