Text Mining - Name Collision with special and regular attributes

text_miner Member Posts: 11 Contributor II
edited June 2019 in Help
Hi,

Since RapidMiner requires all attribute names to be unique, I've noticed a potential naming conflict when doing text mining.  If a special attribute with name X exists, then a regular attribute with the same name cannot also exist (or the regular attribute gets removed when the special attribute is created).  For example, the special attributes "id" and "label" are relatively common terms that may also appear in text documents.

Is there any way to specify a prefix/postfix for all special attributes (e.g., metadata_ or specattr_) so name collisions are less likely to occur?  If not, could something be added to the configuration options or on the root Process node to allow for this functionality?

Thanks!

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    The document processing operators make sure that no attribute name is used twice. If words like label or id occur, they are assigned attribute names such as label_0 (or label_1 if label_0 already exists). This is remembered in the word list so that the attributes are named identically during application.

    Greetings,
      Sebastian
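
For readers who want to see the renaming scheme from the answer above spelled out, here is a minimal Python sketch. It is not RapidMiner's actual code: the function names `unique_name` and `build_name_map` are hypothetical, and the sketch assumes the suffix counter starts at 0 and increases until a free name is found, as the label_0 / label_1 example suggests.

```python
def unique_name(term, taken):
    """Return term unchanged if it is free, else the first free term_0, term_1, ..."""
    if term not in taken:
        return term
    suffix = 0
    while f"{term}_{suffix}" in taken:
        suffix += 1
    return f"{term}_{suffix}"


def build_name_map(terms, special_names):
    """Map each term to a collision-free attribute name.

    The returned dict stands in for the word list (an assumption of this
    sketch): reusing it at application time gives every term the same
    attribute name it received during training.
    """
    taken = set(special_names)  # names already claimed by special attributes
    name_map = {}
    for term in terms:
        if term in name_map:
            continue  # term was already assigned a name
        name = unique_name(term, taken)
        taken.add(name)
        name_map[term] = name
    return name_map


# Special attributes "id", "label" and "label_0" already exist.
name_map = build_name_map(
    ["label", "id", "price"],
    special_names={"id", "label", "label_0"},
)
print(name_map)  # 'label' -> 'label_1', 'id' -> 'id_0', 'price' -> 'price'
```

Storing the mapping rather than recomputing it is what keeps the attribute names stable between training and application, even if the set of special attributes differs later.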