[SOLVED] TextAnnotator - Create new label - Finish tagging

CharlieFirpo Member Posts: 48 Contributor II
edited November 2018 in Help
Dear all!

I have a sample training exercise: retrieve a previously created ExampleSet and manually annotate persons, locations, and organizations in the text of the source document. The ExampleSet has a "word" attribute that contains the words of the source document (produced with SentenceTokenizer and WordTokenizer).
Now I use "Generate Empty Attribute" to create an attribute for the labels (PER, LOC, ...). Then I have a "Loop Examples" operator with a "Set Data" operator inside it. In "Set Data" I use the attribute name "label" and the value "O". This sets the label attribute to "O" for all examples (rows), that is, for all words of the original document. After "Loop Examples" comes the "TextAnnotator" operator, where the text-attribute is "word" and the label-attribute is "label". I run the process and see in the ExampleSet result that every word has the label value "O". That is OK. Then I switch to the TextAnnotator view to see its result. I find only one label, "O", in white, and the whole text of the document has a white background. That is OK too. On that page there are also the options "Create new Label" and "Finish tagging". I am able to create new labels ("PER", "LOC", "ORG"), but how can I manually set which word of the text is a person, a location, or an organization?
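
For readers following along outside RapidMiner, the setup described above (one row per word, a new "label" attribute, every row set to "O") can be sketched in plain Python. This is only an illustration of the data layout, not of how the operators work internally; the sample words and the helper name `init_labels` are made up for the example.

```python
# Hypothetical stand-in for the ExampleSet: one dict per token (row).
# The sample words are invented for illustration only.
words = ["Charlie", "lives", "in", "Budapest"]


def init_labels(words, default="O"):
    """Return one row per word, each carrying the default 'outside' label.

    Mirrors the described combination of "Generate Empty Attribute"
    followed by "Loop Examples" + "Set Data" with the value "O".
    """
    return [{"word": w, "label": default} for w in words]


rows = init_labels(words)
for row in rows:
    print(row["word"], row["label"])
```

After this step every row has the label "O", which matches the all-white text seen in the TextAnnotator view before any manual tagging.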

For example, I create the "PER" label by writing "PER" into the "labelName" field and clicking the Create new Label button. Then I double-click a string (the name of a person) in the text and click Finish tagging. But nothing happens. The selected string still has a white background.

Can anybody help me?

Thank you!!!!


  • CharlieFirpo Member Posts: 48 Contributor II
    Ahh, I found it...
    After selecting a token (e.g. the name of a person), I have to press the "Alt" key, and the background of that token changes. After tagging all tokens, I have to press the Finish tagging button, and then the values of the label attribute change.

    Wondering why there is no "Tag it" button next to the "Create new Label" button...
    And the message is also so "meaningful" for a beginner: "Shortcuts: 'Strg' for next and 'Alt' for previous Label. 'Alt Gr' selects next word."
  • CharlieFirpo Member Posts: 48 Contributor II
    After I create the "PER" label, tag a string with it, and then create a second label ("LOC"), RapidMiner crashes with this message:
    gui.dialog.error.Error during logging: .title
    gui.dialog.error.Error during logging: .message
    Invalid insert

    This happens even if I press the Finish tagging button after tagging a PER token, before creating the LOC label.
    Brilliant :)
  • CharlieFirpo Member Posts: 48 Contributor II
    Another bug:
    1) If I create several label-value pairs (label-PER; label-LOC; label-ORG; label-MISC) in the Set Data suboperator (of the Loop Examples operator) and the last label is not the white-background "O" but e.g. MISC, the whole text background becomes (let's say) orange. That is OK. But if I change only one token's label from MISC to PER (or LOC, ORG, or "O"), the labels of the whole sentence (4-5 tokens) change to PER (or LOC, ...), while the PER background color appears only on the selected token, not on the whole sentence.
    2) When the whole text has an orange background (because the last label-value pair is MISC, or PER, LOC, or ORG, but not "O"), the spaces between the words are also colored orange. But spaces are not tokens. Or are they?
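
The behaviour expected in the posts above (tags are applied per token, written to the label attribute only on "Finish tagging", and spaces are not tokens) can be sketched as a small toy model in Python. This is not how TextAnnotator is implemented; the class `Annotation`, the whitespace-based tokenizer, and the sample sentence are all assumptions made purely for illustration.

```python
import re


def tokenize_with_offsets(text):
    """Split on whitespace, returning (start, end, token) character offsets."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


class Annotation:
    """Toy model: pending tags are written to the labels only on finish()."""

    def __init__(self, text, default="O"):
        self.text = text
        self.tokens = tokenize_with_offsets(text)
        self.labels = [default] * len(self.tokens)
        self.pending = {}  # token index -> label chosen but not yet committed

    def tag(self, index, label):
        """Mark a single token; neighbouring tokens are left untouched."""
        self.pending[index] = label

    def finish(self):
        """Commit all pending tags, like pressing "Finish tagging"."""
        for index, label in self.pending.items():
            self.labels[index] = label
        self.pending.clear()


# Invented sample sentence, for illustration only.
ann = Annotation("Charlie visited Budapest yesterday")
ann.tag(0, "PER")
ann.tag(2, "LOC")
labels_before_finish = list(ann.labels)
ann.finish()
print(labels_before_finish)
print(ann.labels)
```

In this model only the tagged tokens change, and the character positions of the spaces fall outside every token span, so a space would carry no label of its own. Whether TextAnnotator treats spaces this way internally is exactly the open question raised above.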