
Replacing whole words with a dictionary

EL75 Member Posts: 43 Contributor II
Hi RapidMiner community,
I can't find a way to replace whole words after a "Read Excel" operator. If I use a "Replace (Dictionary)" operator linked to an Excel file, words are only partially substituted, because the text is not tokenized: sometimes part of a word is replaced and the result is glued to the rest of the word. For instance, if my dictionary has many entries for misspelled forms of the word « application » (e.g. app, apple, etc.), the result can be « applicationlicationncation »... My data set contains many misspelled terms, so I'd like to use such a process to substitute the common misspellings.
Inside the « text processing » operator, after tokenization, I could do it, but there's no operator to handle this (as far as I've seen). The « replace token » operator could do the job, but I would have to enter one by one all the entries that I currently have in my misspelling dictionary.
Thanks for your help!
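
For readers outside RapidMiner, here is a minimal Python sketch of one way this kind of garbled output can arise. It assumes the dictionary entries are applied one after another as plain substring replacements; that is an assumption for illustration, not a description of how the operator works internally. The dictionary and the sample sentence are hypothetical.

```python
# Hypothetical misspelling dictionary (misspelled form -> correct form).
misspellings = {"app": "application", "appl": "application"}


def naive_replace(text, mapping):
    # Apply each entry as a plain substring replacement, one after another.
    # Later entries can match inside text produced by earlier entries.
    for wrong, right in mapping.items():
        text = text.replace(wrong, right)
    return text


garbled = naive_replace("the appl crashed", misspellings)
print(garbled)  # the applicationicationl crashed
```

The first entry turns "appl" into "applicationl", and the second entry then matches the "appl" at the start of that new word, which is why the replacements pile up.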
Answers

  • kayman Member Posts: 662 Unicorn
    Use regex word boundaries. For instance, \bapp\b matches only the whole word app, whether it appears at the beginning, in the middle, or at the end of a sentence.
  • EL75 Member Posts: 43 Contributor II
    edited November 2020
    Thanks Kayman for your response. I've tried it: I duplicated my Excel sheet (see file enclosed), but the Replace operator treats \b as part of the word rather than as a regex, so it doesn't find the word and replaces nothing. And I have many misspelled forms of "application".

    Best regards
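
As an illustration of the word-boundary idea suggested in this thread, here is a minimal Python sketch, not a RapidMiner process. It assumes the misspelling dictionary has already been loaded into a plain mapping; the `corrections` entries and the `replace_whole_words` helper are hypothetical names. All dictionary keys are combined into one pattern wrapped in `\b`, so the text is scanned once and text that has already been replaced is never matched again.

```python
import re

# Hypothetical misspelling dictionary (misspelled form -> correct form).
corrections = {
    "app": "application",
    "appl": "application",
    "aplication": "application",
}

# Build one alternation of all keys, longest first, wrapped in word boundaries.
# re.escape keeps each dictionary entry literal, so only \b acts as regex syntax.
keys = sorted(corrections, key=len, reverse=True)
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
    flags=re.IGNORECASE,
)


def replace_whole_words(text):
    # Single pass over the text: each whole-word match is looked up in the dictionary.
    return pattern.sub(lambda m: corrections[m.group(0).lower()], text)


print(replace_whole_words("the app and the appl, not apple or application"))
```

Because of the `\b` anchors, "apple" and "application" are left untouched while "app" and "appl" are replaced. The same kind of pattern could in principle be stored in the dictionary sheet itself, provided the tool reading it interprets the entries as regular expressions rather than literal text.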