
"whitespace regular expression"

student24 Member Posts: 7 Contributor II
edited June 2019 in Help
Hello everybody,

I want to search for words in documents. I use the Filter Tokens (by Content) operator with a regular expression. To search for more than one word, I use word1|word2|...|wordn. My question is: how can I search for an expression that contains whitespace, for example "Research and Development|Word2|Word3"? Is there a wildcard for whitespace?

Thanks for your help

Answers

  • RalfKlinkenberg Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member, Unconfirmed, University Professor Posts: 68 RM Founder
    You can use
    • "\s" as a placeholder for a single whitespace character (space, tab, line break, etc.),
    • "\s+" for one or more whitespace characters, and
    • "\s*" for zero or more whitespace characters.
    • "\t" is a placeholder for a tab character only.
    • "." stands for an arbitrary character (by default, any character except a line break).
    RapidMiner regular expressions use the Java syntax for regular expressions. If you search for "Java regular expressions" with Google or another search engine, you will find a lot of documentation.
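
    Since the syntax is Java's, the placeholders above can be tried outside RapidMiner with a few lines of plain Java. This is only a minimal sketch: the class and method names are made up for illustration, and note that backslashes are doubled in Java source code, whereas in a RapidMiner parameter field you would type a single backslash.

    ```java
    import java.util.regex.Pattern;

    // Illustrative class name, not part of RapidMiner.
    class WhitespaceQuantifiers {
        // Returns true if the WHOLE input matches the regex (Java syntax).
        static boolean fullMatch(String regex, String input) {
            return Pattern.matches(regex, input);
        }

        public static void main(String[] args) {
            System.out.println(fullMatch("a\\sb", "a b"));    // true: exactly one whitespace character
            System.out.println(fullMatch("a\\sb", "a  b"));   // false: two spaces, \s allows only one
            System.out.println(fullMatch("a\\s+b", "a  b"));  // true: one or more
            System.out.println(fullMatch("a\\s*b", "ab"));    // true: zero or more
            System.out.println(fullMatch("a\\sb", "a\tb"));   // true: \s also covers tabs
            System.out.println(fullMatch("a\\tb", "a b"));    // false: \t matches a tab only
            System.out.println(fullMatch("a.b", "a-b"));      // true: . is any single character
        }
    }
    ```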

    Example: "Research\sand\sDevelopment" for "Research and Development".
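
    Combined with the alternation from the question, the phrase simply becomes one more alternative. The sketch below is plain Java rather than RapidMiner itself, and it makes a few assumptions: the class and method names are hypothetical, "Word2" and "Word3" are the placeholders from the question, "\s+" is used instead of "\s" to tolerate repeated spaces, and the filter is assumed to require the whole token to match. Whether a token can contain whitespace at all depends on how the documents were tokenized before the filter.

    ```java
    import java.util.regex.Pattern;

    // Illustrative class name, not part of RapidMiner.
    class PhraseAlternation {
        // In a RapidMiner parameter field this would be typed with single backslashes:
        // Research\s+and\s+Development|Word2|Word3
        static final Pattern KEEP =
                Pattern.compile("Research\\s+and\\s+Development|Word2|Word3");

        // True if the entire token matches one of the alternatives.
        static boolean keep(String token) {
            return KEEP.matcher(token).matches();
        }

        public static void main(String[] args) {
            System.out.println(keep("Research and Development"));   // true
            System.out.println(keep("Research  and\tDevelopment")); // true: any run of whitespace
            System.out.println(keep("Word2"));                      // true
            System.out.println(keep("Research"));                   // false: only part of the phrase
            System.out.println(keep("ResearchandDevelopment"));     // false: \s+ needs at least one whitespace
        }
    }
    ```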

    Best wishes,
    Ralf