
# "k-means and its centroid table values SOLVED"

John_Davis
Member, Posts: **9**, Contributor II
Hi,

The k-means operator in RapidMiner gives us a centroid table in which each cluster contains items and corresponding values. What are these values: tf-idf, Chi2, information rate, ...?

Yours

John Davis



## Answers

Those are probably columns that were present in your data.

k-Means defines clusters by their central data point, i.e. the average of all elements in the cluster. These so-called centroids are listed in the centroid table, where each column contains the attribute values of one centroid.

Best regards,

Marius
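To make the centroid definition above concrete, here is a minimal sketch in plain Python. It is not RapidMiner code: the function name `centroid_table`, the sample rows, and the cluster labels are all made up for illustration, and the cluster assignments are assumed to be given already.

```python
from collections import defaultdict


def centroid_table(rows, labels):
    """Return {cluster_label: [mean of each attribute over the cluster]}."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    # zip(*members) walks the attributes column by column
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


# Hypothetical data: 4 examples with 2 attributes, already assigned to clusters
rows = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [20.0, 0.0]]
labels = ["cluster_0", "cluster_0", "cluster_1", "cluster_1"]

table = centroid_table(rows, labels)
for label, centroid in table.items():
    print(label, centroid)
```

Each entry of `table` corresponds to one column of the centroid table: the per-attribute average of the examples in that cluster.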

I think I was not so clear in my first post.

I understand that when using the k-means operator, one can look through the example set at each cluster's centroid (i.e. the attribute values of each cluster's centroid). My question is about the values that are given in the k-means spreadsheet. For example, when applying k-means to textual data (k=3 clusters), one could end up with a k-means spreadsheet like:

| ATTRIBUTE | cluster_1 | cluster_2 | cluster_3 |
|---|---|---|---|
| word x | 0.2 | 0.01 | 0.2 |
| word y | 0.4 | 0.3 | 0.01 |
| word z | 0 | 0.03 | 0.002 |

What are the values in each column?

Yours

John

Do you mean how to interpret the values, or what they mean? They are the normalized TF-IDF values of the centroids. The TF-IDF values are created by the Process Documents operator, and you will find plenty of information if you google for TF-IDF. Basically it is a kind of smart counting of the words in the documents.

Best regards,

Marius
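As a rough illustration of the "smart counting" described above, the sketch below computes length-normalized TF-IDF vectors in plain Python and averages two of them into a centroid. It assumes the common textbook weighting (term frequency times log(N / document frequency), then scaling each vector to unit length); the exact formula used by the Process Documents operator may differ, and the function name `tfidf_vectors` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with one unit-length TF-IDF vector per document.

    Assumes every document is a non-empty list of tokens.
    """
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    # Document frequency: in how many documents each word occurs
    df = Counter(w for d in docs for w in set(d))
    vectors = []
    for d in docs:
        counts = Counter(d)
        vec = [counts[w] / len(d) * math.log(n / df[w]) for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        vectors.append([v / norm if norm else 0.0 for v in vec])
    return vocab, vectors


# Hypothetical toy corpus, already tokenized
docs = [["apple", "apple", "pie"], ["apple", "tart"], ["car", "engine", "car"]]
vocab, vectors = tfidf_vectors(docs)

# Suppose k-means put the first two documents into one cluster:
# the centroid is the per-word average of their TF-IDF vectors.
centroid = [sum(col) / 2 for col in zip(vectors[0], vectors[1])]
for word, value in zip(vocab, centroid):
    print(word, round(value, 3))
```

Read this way, a high value for a word in a cluster's column means the word carries a lot of weight, on average, in that cluster's documents, while a value near 0 means it barely occurs there.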

Yours

John