prefixing attribute names produced by WVT?

kirke Member Posts: 4 Contributor I
edited November 2018 in Help
Hi,

I am working with a text collection that contains a great many words, including words such as "label" and "id" that are already used as attribute names in an example set. TextInput gives me warnings like the one below, and I wonder whether there is an easy way to prefix all attribute names that originate from words (similar to the -P option of StringToWordVector in Weka).

[Warning] TextInput: The original example example set already contains an attribute named "label".
This is likely to cause trouble. Please rename the attribute in the original example set.
Right now I don't believe it causes much trouble, but maybe I just missed some option in WVT TextInput to fix it.

Thank you very much!
/kirke

Answers

  • soras Member Posts: 3 Contributor I
    Hi,

    One way to add a prefix is to put a "TokenReplace" operator inside "TextInput" as a child and define the replacement with regular expressions. For example, if your words consist only of the letters a to z, you can define "([A-Za-z]+)" as the word pattern and "word_$1" as the replacement, where "word_" is the prefix.
    I hope it helps a bit.
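
To see what that pattern and replacement do to a token, here is a minimal Python sketch of the same substitution. It is not RapidMiner code: the function name `prefix_tokens` and the sample tokens are made up for illustration, and Python writes the capture group as `\g<1>` where the operator setting above uses `$1`.

```python
import re

# Pattern and prefix taken from the answer above; the names are hypothetical.
WORD_PATTERN = re.compile(r"([A-Za-z]+)")
PREFIX = "word_"

def prefix_tokens(tokens):
    """Return the tokens with PREFIX prepended to every run of ASCII letters."""
    # \g<1> is Python's reference to the first capture group
    # (the counterpart of $1 in the replacement string above).
    return [WORD_PATTERN.sub(PREFIX + r"\g<1>", token) for token in tokens]

tokens = ["label", "id", "text"]
print(prefix_tokens(tokens))  # ['word_label', 'word_id', 'word_text']
```

Note that the pattern matches every run of letters, so a token that mixes letters and digits, such as "x1y", would receive the prefix more than once ("word_x1word_y"). If such tokens can occur, anchoring the pattern to the start of the token is one possible adjustment.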

  • kirke Member Posts: 4 Contributor I
    This worked fine, thank you!