Rename Attributes with Dictionary

t_liebe Member Posts: 14 Contributor I
edited January 2019 in Help

Hello,

after text processing a document, I want to rename the single words (attributes). Is it possible to connect this with a dictionary? Otherwise I have to do it separately for every single attribute.

Renaming before the text processing is not possible, since these are 3-letter codes that could also occur inside other words. In this case, the attributes arc, sas and afl should be replaced:

arc  sas  bet  hello  house  door  super  afl
1    0    0    0      0      1     0      0
0    1    1    0      0      0     0      0

Thank you for your help.

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi!

    Use Loop Examples over the dictionary contents. In each iteration, extract the code and its replacement from the dictionary as macros, then rename the matching attribute using the macro values.

    Regards,
    Balázs
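
The loop described above can be sketched outside RapidMiner in plain Python. This is only an illustration of the logic, not a RapidMiner process: the function name `rename_attributes`, the row keys `code` and `replacement`, and the replacement labels are all hypothetical, since the thread does not say what arc, sas and afl should become.

```python
# Attribute names as they come out of the text processing step (from the question).
attributes = ["arc", "sas", "bet", "hello", "house", "door", "super", "afl"]

# Hypothetical dictionary: one row per code. The replacement labels are
# placeholders, not values taken from the original thread.
dictionary_rows = [
    {"code": "arc", "replacement": "ARC_LABEL"},
    {"code": "sas", "replacement": "SAS_LABEL"},
    {"code": "afl", "replacement": "AFL_LABEL"},
]


def rename_attributes(attribute_names, rows):
    """Rename attributes whose full name equals a dictionary code."""
    renamed = list(attribute_names)
    # One iteration per dictionary row, analogous to Loop Examples.
    for row in rows:
        # These two values play the role of the macros extracted per iteration.
        code = row["code"]
        replacement = row["replacement"]
        # Only an exact match on the whole name is renamed, so words that
        # merely contain the code are left alone.
        renamed = [replacement if name == code else name for name in renamed]
    return renamed


renamed_attributes = rename_attributes(attributes, dictionary_rows)
print(renamed_attributes)
```

Because the comparison is on the whole attribute name, this touches only the column headers and never the text itself, which is why it stays cheap even on large data sets.
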
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can also do this inside your Process Documents using the Replace Tokens operator. This is probably easier than the other approach with macros.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • t_liebe Member Posts: 14 Contributor I

    Thank you for your replies.

    @Telcontar120 The problem is that other words might include the letters "arc", and if I replace them, it would have an effect on those words. Or does the replacement only affect the tokens after the extraction?

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Replace Tokens only affects tokens; you should be able to try it and confirm this.
    Brian T.
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Replace Tokens actually replaces using regular expressions within the tokens. So you need to make your regular expressions include the beginning and end of the string. These are represented by \A and \Z.
    So if you use the regex \Aarc\Z, it will only match tokens that are exactly "arc" and will not match "extrarction".
    Still, doing that is a large computational overhead in comparison to renaming columns. So if you are working with large data sets, this is the wrong way, and I would recommend Balázs's renaming approach.

    Greetings,
    Sebastian
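
The effect of anchoring the pattern can be checked with a small sketch. It uses Python's `re` module rather than the Java regex engine behind RapidMiner, so treat it as an approximation: `\A` and `\Z` exist in both, but Java's `\Z` also tolerates a final line terminator, which does not matter for single-word tokens. The helper names and the replacement text `ARC` are hypothetical.

```python
import re


def replace_anywhere(tokens, code, replacement):
    """Unanchored replacement: also rewrites tokens that merely contain the code."""
    pattern = re.compile(re.escape(code))
    # A function is used as the replacement so it is inserted literally.
    return [pattern.sub(lambda match: replacement, token) for token in tokens]


def replace_whole_token(tokens, code, replacement):
    """Anchored replacement: \\A and \\Z force the match to span the whole token."""
    pattern = re.compile(r"\A" + re.escape(code) + r"\Z")
    return [pattern.sub(lambda match: replacement, token) for token in tokens]


# "extrarction" is the example from the post; "search" is an extra token
# that also contains the letters "arc".
tokens = ["arc", "extrarction", "search"]

unanchored = replace_anywhere(tokens, "arc", "ARC")
anchored = replace_whole_token(tokens, "arc", "ARC")

print(unanchored)  # ['ARC', 'extrARCtion', 'seARCh']
print(anchored)    # ['ARC', 'extrarction', 'search']
```

The anchored version leaves every longer word untouched, but it still has to run the regex against each token in each document, which is the overhead mentioned above.
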