
lower case without removing numbers

m_meijs Member Posts: 6 Contributor I
edited November 2018 in Help

Hi all,

 

I am trying to join two tables together. Because they are both open fields, there are some differences in names regarding punctuation, use of capitals etc.
To overcome this problem, I have transformed the variables I match on like this: replaceAll(lower(trim(VARIABLE)), "[. ,()-:?]","").

However, this code also deletes all numbers in the variable name and I don't want that.

 

Any thoughts?

Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    If these are nominal values, then I would definitely use the Replace operator and do the regex first to remove the punctuation AND then do the lowering/trimming.

     

    Just doing a regex with \W selects all the punctuation; replacing it with either a "_" or nothing will then get rid of the punctuation. If that's not the desired effect, then I would consider splitting off the numbers, doing the lower/trim, and then rejoining the numbers.
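The order of operations described above can be sketched outside RapidMiner. This is a minimal Python illustration, not RapidMiner syntax; the function name `normalize_key` and the sample string are hypothetical. It removes everything matched by \W first and only then trims and lowercases, so letters, digits, and underscores survive.

```python
import re


def normalize_key(value: str) -> str:
    """Hypothetical helper: build a join key that ignores punctuation and case."""
    # \W matches anything that is NOT a letter, digit, or underscore,
    # so punctuation and spaces are removed while numbers are kept.
    cleaned = re.sub(r"\W", "", value)
    # Trim and lowercase afterwards, mirroring the suggested order.
    return cleaned.strip().lower()


print(normalize_key("  Acme-Corp. (2018) "))  # acmecorp2018
```

Applying the same function to the matching column of both tables before the join gives keys that differ only where the letters or digits themselves differ.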

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    It sounds like you'll need the Rename By Replacing operator and write some REGEX to skip over the numbers in your attribute names.

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Yes, agreed. I would also try changing the order of that expression. It looks like you're applying the RegEx after "lower" and "trim". Try doing the replaceAll RegEx first.


    Scott

  • m_meijs Member Posts: 6 Contributor I

    Thanks. I have tried that, but unfortunately it doesn't work. It removes all characters and spaces, but when converting to lower case it still removes the numbers.

    Any thoughts on what to add to make sure it doesn't do that :)?
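One possible explanation for the missing numbers, independent of the order of operations: inside a character class, `)-:` is read as a range from `)` to `:`, and in ASCII that range contains the digits 0 to 9. The sketch below reproduces this in Python, on the assumption that RapidMiner's replaceAll follows the same standard character-class rules; the function names and sample string are hypothetical. Moving the hyphen to the end of the class makes it a literal character.

```python
import re

# Pattern from the question: ")-:" is a range that also covers 0-9.
RANGE_PATTERN = r"[. ,()-:?]"
# Same characters, but the hyphen is last, so it is a literal "-".
LITERAL_PATTERN = r"[. ,():?-]"


def strip_with_range(value: str) -> str:
    """Hypothetical helper using the original character class."""
    return re.sub(RANGE_PATTERN, "", value.strip().lower())


def strip_with_literal_hyphen(value: str) -> str:
    """Hypothetical helper using the reordered character class."""
    return re.sub(LITERAL_PATTERN, "", value.strip().lower())


print(strip_with_range("Route 66 (North)"))           # routenorth
print(strip_with_literal_hyphen("Route 66 (North)"))  # route66north
```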

  • m_meijs Member Posts: 6 Contributor I

    This works! Thanks!
