# TFIDF Mean

I ran TF-IDF on some text in four files:

1) alpha bravo

2) alpha bravo

3) alpha bravo charlie delta

4) alpha bravo charlie delta

How is the "statistic" field calculated in the Meta data view output here? Is the mean shown there the TF-IDF measure, f[ij] / f[dj] * log(D / f)?

When I run it on "charlie" from above, RapidMiner gives 0.354. When I do the calculation by hand, 1/4 * log(4 / 2), I get 0.075. Is the result normalized somehow, or is the log the natural log or base 2?
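For reference, the hand calculation can be repeated with each common log base. This is only a sketch of the arithmetic in the question (term frequency 1/4, 4 files, 2 of them containing "charlie"); the variable names are made up for illustration. It suggests that 0.075 corresponds to log base 10, and that no choice of base alone gives 0.354.

```python
import math

tf = 1 / 4        # "charlie" is 1 of the 4 terms in file 3
ratio = 4 / 2     # 4 files in total, 2 of them contain "charlie"

# Same formula, three different log bases
by_base = {
    "log10": tf * math.log10(ratio),
    "ln": tf * math.log(ratio),
    "log2": tf * math.log2(ratio),
}

for name, value in by_base.items():
    print(f"{name}: {value:.3f}")
# log10: 0.075
# ln: 0.173
# log2: 0.250
```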

Thank you for any input.

mj

## Answers

As I already explained in another topic, the mean is simply the statistical mean of all values in this attribute. Please take a look at the other topic for more information.

Greetings,

Sebastian
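To illustrate "the mean of all values in this attribute", here is a minimal sketch that builds one TF-IDF vector per file and averages the "charlie" column over the four files. It assumes that each document vector is scaled to unit Euclidean length before the mean is taken. That normalization is an assumption, not something confirmed in this thread, but under it the mean for "charlie" comes out at 0.354. The helper `tfidf_row` and the other names are hypothetical, not RapidMiner API.

```python
import math

docs = [
    ["alpha", "bravo"],
    ["alpha", "bravo"],
    ["alpha", "bravo", "charlie", "delta"],
    ["alpha", "bravo", "charlie", "delta"],
]
vocab = sorted({term for doc in docs for term in doc})
n_docs = len(docs)
# Number of documents that contain each term
df = {term: sum(term in doc for doc in docs) for term in vocab}

def tfidf_row(doc):
    """TF-IDF vector for one document, scaled to unit length (assumed step)."""
    raw = [doc.count(t) / len(doc) * math.log(n_docs / df[t]) for t in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    # Files 1 and 2 contain only terms with IDF 0, so their vectors stay all-zero
    return [v / norm if norm > 0 else 0.0 for v in raw]

rows = [tfidf_row(doc) for doc in docs]

# The "mean" statistic: plain average of each attribute over all documents
means = {t: sum(row[i] for row in rows) / n_docs for i, t in enumerate(vocab)}

print(f"{means['charlie']:.3f}")  # 0.354
```

Under this assumption the log base does not matter: it scales every entry of a vector by the same factor, which the length normalization cancels. "alpha" and "bravo" occur in every file, so their IDF is log(4/4) = 0 and their mean is 0.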