Term Frequency-Inverse Document Frequency

TF-IDF

Pre-Processing

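TF-IDF is a way of vectorizing the contents of a corpus: each document becomes a vector of per-term weights. TF stands for term frequency, which is the number of times a word appears in a document. IDF stands for inverse document frequency, which gives greater weight to words that appear in fewer documents of the corpus. The TF-IDF weight of a word in a document is the product of the two, so a word scores highly when it is frequent in that document but rare across the corpus.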
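To make the definition concrete, here is a minimal sketch in plain Python. It assumes one common variant of the weighting, which the text above does not specify: term frequency is the raw count divided by the document length, and IDF is the natural log of (number of documents / number of documents containing the term), with no smoothing. The function name tf_idf and the toy corpus are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    `corpus` is a list of documents, each already tokenized into a list
    of words. Assumed variant: tf = count / document length,
    idf = ln(N / df), no smoothing.
    """
    n_docs = len(corpus)

    # Document frequency: how many documents contain each term at least once.
    df = Counter()
    for doc in corpus:
        df.update(set(doc))

    vectors = []
    for doc in corpus:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy corpus (illustrative only), tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
weights = tf_idf(docs)
```

With this variant, a word that occurs in every document (here "the") gets a weight of zero, while a word confined to a single document (here "mat") gets the largest weight for its count. Library implementations often smooth the IDF term or normalize the resulting vectors, so their numbers will generally differ from this sketch.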


Related