Term Frequencies greater than 1
Dear all,
I use "Text Processing - Process Documents From Files" to calculate word vectors for documents.
As I read here: http://rapid-i.com/rapidforum/index.php?PHPSESSID=0aba344304fbb94614ad24f236d974e4&topic=3728.0
term frequencies are normalized (as I expected).
To me, this means that term frequency values should always be < 1.
In my case I use TF-IDF for vector creation, as proposed, and get some term frequencies in the range of 1E+10 or 1E+11.
The related documents look "normal" to me.
Any ideas why this happens? What am I not understanding?
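
To make the question concrete, here is a minimal Python sketch of one common textbook TF-IDF definition: relative term frequency multiplied by log(N / document frequency). This is an assumption for illustration only and not necessarily the formula that "Process Documents From Files" applies; the toy corpus and all function names are hypothetical. Under this definition a relative term frequency never exceeds 1, but a TF-IDF weight is only bounded by log(N), so it can exceed 1 unless each document vector is normalized afterwards.

```python
import math
from collections import Counter


def relative_tf(tokens):
    """Term counts divided by document length, so each value is <= 1."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def idf(corpus):
    """log(N / df) per term, where df is the number of documents containing it."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for tokens in corpus:
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(corpus):
    """One {term: weight} dict per document, without any vector normalization."""
    weights = idf(corpus)
    return [
        {term: tf * weights[term] for term, tf in relative_tf(tokens).items()}
        for tokens in corpus
    ]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; every component ends up <= 1."""
    norm = math.sqrt(sum(value * value for value in vector.values()))
    if norm == 0:
        return dict(vector)
    return {term: value / norm for term, value in vector.items()}


# Hypothetical toy corpus: one document dominated by a term that is rare elsewhere.
corpus = [["rapidminer"] * 9 + ["text"]] + [["text", "mining"] for _ in range(19)]

tf_first = relative_tf(corpus[0])
vectors = tf_idf(corpus)
normalized_first = l2_normalize(vectors[0])

print(tf_first["rapidminer"])          # relative term frequency, below 1
print(vectors[0]["rapidminer"])        # unnormalized TF-IDF weight, above 1 here
print(normalized_first["rapidminer"])  # after L2 normalization, at most 1
```

If values as large as 1E+10 appear, this sketch suggests they cannot come from this particular definition alone, so it may be worth checking the input data and the chosen vector creation settings.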
Answers
Am I thinking about this the wrong way?
Can term frequencies be greater than 1?
Are there circumstances where it is better to use a different method for vector creation?
Under which circumstances is each method for vector creation most appropriate?
Many thanks in advance for any hint ...
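
For comparison, the simpler vector creation schemes (binary term occurrences, term occurrences, term frequency) can be sketched in Python as follows. These functions are hypothetical illustrations of the general ideas, not RapidMiner's implementation, and the vocabulary and documents are made up. The sketch shows how the same two terms are weighted in a short and a long document under each scheme.

```python
from collections import Counter


def binary_occurrences(tokens, vocabulary):
    """1 if the term appears in the document at all, otherwise 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


def term_occurrences(tokens, vocabulary):
    """Raw counts; these grow with document length."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def term_frequency(tokens, vocabulary):
    """Counts divided by document length; each value lies between 0 and 1."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[term] / total if total else 0.0 for term in vocabulary]


# Hypothetical vocabulary and documents, for illustration only.
vocabulary = ["data", "mining", "text"]
short_doc = ["text", "mining"]
long_doc = ["text", "mining"] * 5 + ["data"]

for name, scheme in [
    ("binary", binary_occurrences),
    ("occurrences", term_occurrences),
    ("frequency", term_frequency),
]:
    print(name, scheme(short_doc, vocabulary), scheme(long_doc, vocabulary))
```

As a rough rule of thumb rather than a firm recommendation: binary occurrences only record presence, raw occurrences favor longer documents, term frequency compensates for document length, and TF-IDF additionally down-weights terms that appear in most documents.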
thanks a lot for your hints ...
I'll try it and see.
BR