Metadata view - "Statistics" and "Range" attributes
When I process documents from files with RapidMiner
and get the TF-IDF value of each term in the documents,
I would like to sort the terms by their TF-IDF weight (in descending order).
For this reason I switch to the "metadata view" in the "Example Set" tab, where I see two ways to sort them:
by the "Statistics" attribute (normally the 4th column) or by the "Range" attribute (normally the 5th column).
An example value for a term in the "Statistics" column looks like this: avg = 0.049 +/-0.084
and one in the "Range" column looks like this: [0.000 ; 0.290]
When I hover over the titles of these columns, I get a really short and incomprehensible message about what they are supposed to show.
My question is pretty simple: can anybody explain in a simple way how the numbers shown in these two columns are calculated
and what they express?
I have no background in statistics, and maybe I'm looking in the wrong place for the kind of sorting I need. If someone can make things clearer, it would be really helpful.
Thank you for your time.