IDF Calculation for Test Set

smjsmj1 New Altair Community Member
edited November 2024 in Community Q&A
Can anyone explain how the IDF value is calculated for test sets?
Is it based on the IDF of the training set?
From what I can see, the test set uses only the word list from the training set, but the IDF is calculated solely from the test set. So if the test set contains only one document, the IDF could become 0, correct?
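
To make the single-document case concrete, here is a minimal sketch. It assumes the textbook definition idf = log(N / df), where N is the number of documents and df is the number of documents containing the term; the tool's actual formula may differ (for example, it may add smoothing). The function name `idf` and the training-set numbers are hypothetical and only for illustration.

```python
import math

def idf(num_docs, doc_freq):
    # Textbook inverse document frequency: log(N / df).
    # Assumed formula for illustration, not necessarily what the tool computes.
    return math.log(num_docs / doc_freq)

# Test set with a single document: every term that occurs has N = 1 and df = 1.
single_doc_idf = idf(1, 1)

# The same term in a hypothetical 100-document training set where it occurs in 5 documents.
training_idf = idf(100, 5)

print(single_doc_idf)  # log(1), so the weight vanishes
print(training_idf)    # log(20), a positive weight
```

Under that assumption, every term in a one-document set gets log(1) = 0, so all TF-IDF weights would collapse to zero if the IDF were recomputed on the test set alone.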

Answers

  • fras
    fras New Altair Community Member
    If you are using TF-IDF, you must store both the model _and_ the wordlist after training.
    To test or score unseen data, you have to preprocess it with exactly the same
    "Process Documents" operator that you used for training, including the wordlist.
  • smjsmj1
    smjsmj1 New Altair Community Member
    Thank you for the reply
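
The advice in the answer above (keep the wordlist from training and reuse it when scoring) can be sketched in plain Python. This is not the "Process Documents" operator itself, only an illustration of the general idea under stated assumptions: documents are already tokenized, IDF is log(N / df), and term frequency is the count divided by the number of in-wordlist tokens. The names `fit_idf` and `transform` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def fit_idf(train_docs):
    """Learn the word list and IDF weights from tokenized training documents."""
    n = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        doc_freq.update(set(doc))  # count each term at most once per document
    # Assumed textbook formula: idf = log(N / df)
    return {term: math.log(n / df) for term, df in doc_freq.items()}

def transform(doc, idf_weights):
    """Score one document with the stored weights; terms outside the word list are dropped."""
    counts = Counter(t for t in doc if t in idf_weights)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: (counts[term] / total) * idf_weights[term] for term in counts}

train_docs = [
    ["cheap", "flights"],
    ["cheap", "hotels"],
    ["flights", "delayed"],
    ["weather", "report"],
]
idf_weights = fit_idf(train_docs)  # stored after training, like the wordlist

# A single unseen document is scored with the training IDF, not its own.
test_doc = ["cheap", "hotels", "tonight"]
vector = transform(test_doc, idf_weights)
print(vector)
```

Because the weights come from the training documents, a test set containing only one document still gets non-zero values, and a word that never appeared in training ("tonight" here) is simply ignored.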
