
"regex not working"

allerkonge Member Posts: 1 Contributor I
edited June 2019 in Help

Dear all,

 

I'm trying to clean a dataset, and I'm working with a couple of regexes. If I test a regex on the website regexpal, it works fine, but if I put the same regex into RapidMiner with the Replace operator, it says there is a mistake.

 

This is one of the regex I'm testing:

 

(?=\b[\m*#])\w+

 

When I try this one, it says it's incorrect.

 

I tried correcting it by doubling the backslashes:

 

(?=\\b[\\m*#])\\w+

 

With this version it no longer says the regex is incorrect, but it doesn't replace anything. The attribute is gender, so I'd like to replace, for example, "mm" or "male" with "M".

 

Thanks a lot for your help.

 

 


Best Answer

    IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted

    Hi,

     

    regexpal only tests JavaScript regexes, but RapidMiner uses the Java regex parser.  Despite the similar names, the two languages are actually completely different, so there is no guarantee that a JavaScript regex will work in Java.  See here for example: http://stackoverflow.com/questions/21883629/java-vs-javascript-regex-matching
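
    To make the difference concrete, here is a minimal sketch in plain Java (not RapidMiner itself; the class and method names are made up for illustration). It assumes the Replace operator hands the text to java.util.regex unchanged. Under that assumption, the original regex should be rejected because \m is not a valid escape in Java, while the doubled-backslash version compiles but looks for literal backslash characters, so it cannot match a value like "male".

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class JavaRegexCheck {
    // Java string literals need their own escaping, so each regex backslash
    // is written twice here. The comments show the regex the engine receives.
    static final String ORIGINAL = "(?=\\b[\\m*#])\\w+";        // (?=\b[\m*#])\w+
    static final String DOUBLED = "(?=\\\\b[\\\\m*#])\\\\w+";   // (?=\\b[\\m*#])\\w+

    /** Returns true if Java's regex engine accepts the pattern. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    /** Returns true if the pattern matches anywhere in the input. */
    static boolean findsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println("original compiles: " + compiles(ORIGINAL));
        System.out.println("doubled compiles:  " + compiles(DOUBLED));
        System.out.println("doubled matches \"male\": " + findsMatch(DOUBLED, "male"));
    }
}
```

    In the doubled version, \\ means "a literal backslash", so the lookahead requires the text to start with a backslash followed by b, while the main pattern requires a backslash followed by w at the same position. Both cannot hold at once, which would explain why nothing gets replaced.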

     

    Anyway, in your case can't you just use "m.*" (without quotes) in "replace_what" and "M" in "replace_by"?

     

    I am not sure if it needs to be more complicated than that but I don't know your data of course...
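
    As a rough illustration of the suggestion above, this sketch uses plain Java's String.replaceAll, which replaces every match found anywhere in the value. Whether the Replace operator behaves exactly the same way is an assumption, and the class and method names are hypothetical. It also shows one thing to check against your data: an unanchored "m.*" matches the tail of "female" too, so an anchored variant such as "^m.*" may be safer if such values occur.

```java
public class GenderReplaceSketch {
    /** Replaces every match of the regex in the value with "M". */
    static String normalize(String value, String regex) {
        return value.replaceAll(regex, "M");
    }

    public static void main(String[] args) {
        String[] samples = {"mm", "male", "female"};
        for (String s : samples) {
            // Unanchored: matches from the first 'm' to the end of the value.
            String unanchored = normalize(s, "m.*");
            // Anchored: only matches values that start with 'm'.
            String anchored = normalize(s, "^m.*");
            System.out.println(s + " -> " + unanchored + " | " + anchored);
        }
    }
}
```

    With these inputs, "mm" and "male" both become "M" under either pattern; "female" becomes "feM" with the unanchored pattern and stays unchanged with the anchored one.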

     

    Hope this helps,

    Ingo
