"new user - help generate text frequency"

AnnaB Member Posts: 1 Contributor I
edited May 2019 in Help
All - I have viewed the YouTube videos pertaining to text analytics, but I am struggling to generate a simple text frequency. I can view the parsed, lowercased, and stopword-cleaned text, but not as a table displaying the frequency of occurrences of the terms in the document. I've tried using Create Document, cutting and pasting a 1,000-word series of sentences, as well as using Read Document to connect to a similar series of sentences delimited by CR/LF. What am I missing, or how can I get help getting started analyzing my text? I must be missing something simple!

read document -> tokenize -> transform case -> filter stopwords -> why can't I see a frequency of words within my document?

Thanks in advance for any help you may be able to offer to get me started!

Answers

  • colo Member Posts: 236 Maven
    Hi AnnaB,

    you have to add a "Process Documents" operator after "Read Document" (or use "Process Documents from Files" instead). Then place the preprocessing steps (tokenize, transform case, filter stopwords) inside the "Process Documents" operator (simply double-click it to open its inner process). "Process Documents" lets you select the desired method for vector creation (in your case, term frequency).

    Regards
    Matthias
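
For anyone who wants to sanity-check the numbers outside RapidMiner, here is a minimal Python sketch of the same pipeline (tokenize, transform case, filter stopwords, then count). It is not RapidMiner code: the function name `term_frequencies` and the tiny `STOPWORDS` set are made up for illustration, and it reports raw occurrence counts, which may not match the exact weighting a given vector creation setting produces.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "and", "of", "on", "is", "in", "to"}


def term_frequencies(text, stopwords=STOPWORDS):
    """Return a Counter mapping each remaining term to its raw count."""
    tokens = re.findall(r"[A-Za-z]+", text)             # tokenize on non-letters
    tokens = [t.lower() for t in tokens]                # transform case
    tokens = [t for t in tokens if t not in stopwords]  # filter stopwords
    return Counter(tokens)


if __name__ == "__main__":
    sample = "The cat sat on the mat. The cat is happy."
    # Print one "term<TAB>count" row per term, most frequent first.
    for term, count in term_frequencies(sample).most_common():
        print(f"{term}\t{count}")
```

The point of the sketch is that the counting is a separate step from the cleaning: the three preprocessing steps only produce tokens, and nothing becomes a frequency table until something aggregates them per document.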