lower case without removing numbers

m_meijs Member Posts: 6 Contributor II
edited November 2018 in Help

Hi all,

I am trying to join two tables. Because the join keys are both free-text fields, the names differ in punctuation, capitalization, and so on.
To work around this, I normalize the variables I match on with this expression: replaceAll(lower(trim(VARIABLE)), "[. ,()-:?]","").

However, this expression also deletes all numbers in the variable name, and I don't want that.

Any thoughts?
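
A note on the expression above: inside a character class, a hyphen between two characters defines a range, so `)-:` likely matches everything from `)` to `:` in character-code order, and that span includes the digits 0 to 9. The sketch below reproduces this in Python, on the assumption that Python's `re` module reads this character class the same way as the regex engine behind RapidMiner's `replaceAll`. The `normalize` helper, the sample value, and the reordered class with the hyphen moved to the end are illustrative, not taken from the thread.

```python
import re

# The character class from the question, copied verbatim.
# ")-:" is read as a range from ")" to ":", which covers "0" through "9".
ORIGINAL_CLASS = r"[. ,()-:?]"

# Hypothetical fix: a hyphen placed last in the class is a literal hyphen.
FIXED_CLASS = r"[. ,():?-]"


def normalize(value: str, char_class: str) -> str:
    """Trim, lower-case, then delete every character matched by char_class."""
    return re.sub(char_class, "", value.strip().lower())


sample = "Acme Corp. (Unit 42)"
print(normalize(sample, ORIGINAL_CLASS))  # digits are removed: acmecorpunit
print(normalize(sample, FIXED_CLASS))     # digits are kept: acmecorpunit42
```

If the same holds in RapidMiner, moving the hyphen to the end of the class (or escaping it) would be enough to keep the numbers.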

Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    If these are nominal values, then I would definitely use the Replace operator and do the regex first to remove the punctuation, AND then do the lowering/trimming.

    A regex with \W selects all the punctuation; replacing each match with either a "_" or nothing gets rid of it. If that's not the desired effect, then I would consider splitting off the numbers, doing the lower/trim, and then rejoining the numbers.
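
The `\W` approach described above can be sketched in Python as follows. This is only an illustration of the idea, not the RapidMiner Replace operator itself; the `match_key` helper and the sample strings are hypothetical. Note that `\W` matches anything that is not a letter, digit, or underscore, so whitespace is removed as well and underscores are kept.

```python
import re


def match_key(value: str) -> str:
    """Drop every non-word character (punctuation and whitespace), then lower-case."""
    return re.sub(r"\W", "", value).lower()


print(match_key("  Acme Corp. (Unit 42)  "))  # acmecorpunit42
print(match_key("ACME-corp unit_42?"))        # acmecorpunit_42
```

Because `\W` already removes surrounding spaces, a separate trim step is not needed in this sketch.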

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    It sounds like you'll need to use the Rename By Replacing operator and write a regex that skips over the numbers in your attribute names.

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Yes, agreed. I would also try changing the order of that expression. It looks like you're applying the regex after "lower" and "trim". Try doing the replaceAll regex first.

    Scott

  • m_meijs Member Posts: 6 Contributor II

    Thanks. I have tried that, but unfortunately it doesn't work. It removes all characters and spaces, but when converting to lower case it still removes the numbers.

    Any thoughts on what to add to make sure it doesn't do that :)?

  • m_meijs Member Posts: 6 Contributor II

    This works! Thanks!