[SOLVED] Clustering SSE measure calculation

Member Posts: 28 Maven
edited November 2018 in Help
Hi,

Does anyone know how to calculate the SSE value of a clustering in RapidMiner?

• Member Posts: 28 Maven
I'll answer my own question and share what I learned.
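
For reference, the SSE (sum of squared errors) of a centroid-based clustering is usually defined as the sum, over all clusters, of the squared Euclidean distances between each example and the centroid of its cluster. The notation below ($K$ clusters $C_k$ with centroids $\mu_k$) is only for illustration and is not something RapidMiner prescribes:

```latex
\mathrm{SSE} = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2
```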

To calculate the SSE measure I wrote a small script for the Execute Script operator. The script is as follows:
/**
* Author: [email protected]
* Version: 0.0.1
* Date: 11/01/2012
*
* Description:
* This script calculates the SSE measure of a clustering produced by a centroid-based clustering algorithm.
*
* Input:
* input[0]: the example set of the clustering
* input[1]: the example set of the centroids
* input[2]: the cluster model of the cluster operator
*
* Output:
* The SSE value of the clustering is written to the log console.
**/

import com.rapidminer.operator.clustering.ClusterModel;
import com.rapidminer.operator.clustering.Cluster;

// The Execute Script operator passes its inputs as a list, in port order.
ExampleSet clusteringSet = input[0];
ExampleSet centroids = input[1];
ClusterModel clustering = input[2];

double sum = 0.0;
centroids.remapIds();
clusteringSet.remapIds();
for (Example centroid : centroids) {
    // The cluster label looks like "cluster_3"; drop the 8-character prefix to get the index.
    String key = centroid.getValueAsString(centroid.getAttributes().get("cluster"));
    key = key.substring(8);
    Cluster cluster = clustering.getCluster(Integer.parseInt(key));

    // Empty clusters contribute nothing to the SSE.
    if (cluster.getNumberOfExamples() == 0) {
        continue;
    }

    Collection<Object> idsList = cluster.getExampleIds();
    for (Object id : idsList) {
        Example example = clusteringSet.getExampleFromId((Double) id);
        double distance = calculateEuclideanDistance(centroid, example);
        sum += distance * distance;
    }
}
operator.logNote("SSE: " + sum);

// Euclidean distance between two examples over the regular attributes of a.
double calculateEuclideanDistance(Example a, Example b) {
    Attribute[] atts = a.getAttributes().createRegularAttributeArray();
    double sum = 0.0;
    for (Attribute att : atts) {
        // Look attributes up by name, since the two examples come from different example sets.
        String attStr = att.getName();
        double aValue = a.getValue(a.getAttributes().get(attStr));
        double bValue = b.getValue(b.getAttributes().get(attStr));
        double difference = aValue - bValue;
        sum += difference * difference;
    }
    return Math.sqrt(sum);
}
The script is very simple, and for now it works only with centroid-based algorithms, but I plan to adapt it to the general case; that only requires calculating the centroids by hand.
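
A rough sketch of what that general case could look like, assuming the data is available as plain arrays rather than RapidMiner example sets: compute each centroid as the mean of its cluster's points, then sum the squared distances to it. This is standalone Java, not Execute Script code, and the names `SseSketch`, `centroids`, and `sse` are made up for illustration.

```java
class SseSketch {

    /** Mean of the points assigned to each cluster; an empty cluster keeps a zero vector. */
    public static double[][] centroids(double[][] points, int[] labels, int k) {
        int dim = points[0].length;
        double[][] mu = new double[k][dim];
        int[] counts = new int[k];
        // Accumulate per-cluster coordinate sums and sizes.
        for (int i = 0; i < points.length; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < dim; d++) {
                mu[labels[i]][d] += points[i][d];
            }
        }
        // Turn the sums into means, skipping empty clusters.
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                continue;
            }
            for (int d = 0; d < dim; d++) {
                mu[c][d] /= counts[c];
            }
        }
        return mu;
    }

    /** Sum of squared Euclidean distances from each point to its cluster's centroid. */
    public static double sse(double[][] points, int[] labels, int k) {
        double[][] mu = centroids(points, labels, k);
        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            for (int d = 0; d < points[i].length; d++) {
                double diff = points[i][d] - mu[labels[i]][d];
                total += diff * diff;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: two clusters of two points each.
        double[][] points = {{0, 0}, {0, 2}, {10, 10}, {12, 10}};
        int[] labels = {0, 0, 1, 1};
        System.out.println("SSE: " + sse(points, labels, 2));
    }
}
```

For the four points in `main` the centroids are (0, 1) and (11, 10), so each point is at squared distance 1 from its centroid and the program prints `SSE: 4.0`.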

If you have any comments or suggestions let me know.
• Member Posts: 1 Contributor I
edited July 25
Hello,
I used the script but I got this error:
"cannot cast object 'cluster 0:..items cluster 1:..items Total number of items' with class 'com.rapidminer.operator.clustering.ClusterModel' to class 'com.rapidminer.example.ExampleSet'"

Thanks