I'll answer my own question and share what I found with you.
To calculate the SSE measure I wrote a small script for the Execute Script operator. Here is the script:
/**
 * Author: N@v
 * Version: 0.0.1
 * Date: 11/01/2012
 *
 * Description:
 * This script calculates the SSE measure of a given clustering
 * produced by a centroid-based clustering algorithm.
 *
 * Input:
 * input[0]: the example set of the clustering
 * input[1]: the example set of the centroids
 * input[2]: the cluster model of the cluster operator
 *
 * Output:
 * The SSE value of the clustering is written to the log console.
 **/
// Explicit imports for the clustering classes, in case the operator's
// default imports do not cover them.
import com.rapidminer.operator.clustering.Cluster;
import com.rapidminer.operator.clustering.ClusterModel;

// Inputs, in the port order documented above.
ExampleSet clusteringSet = input[0];
ExampleSet centroids = input[1];
ClusterModel clustering = input[2];

double sum = 0;
centroids.remapIds();
clusteringSet.remapIds();

for (Example centroid : centroids) {
    // Assumes cluster labels of the form "cluster_N": drop the 8-character prefix to get N.
    String key = centroid.getValueAsString(centroid.getAttributes().get("cluster"));
    Cluster cluster = clustering.getCluster(Integer.parseInt(key.substring(8)));

    // An empty cluster contributes nothing to the sum.
    if (cluster.getNumberOfExamples() == 0) {
        continue;
    }

    for (Object id : cluster.getExampleIds()) {
        Example example = clusteringSet.getExampleFromId((Double) id);
        double distance = calculateEuclideanDistance(centroid, example);
        sum += distance * distance;
    }
}

operator.logNote("SSE: " + sum);

// Euclidean distance between two examples over the regular attributes of a.
double calculateEuclideanDistance(Example a, Example b) {
    double sum = 0;
    for (Attribute att : a.getAttributes().createRegularAttributeArray()) {
        // Look attributes up by name, since a and b come from different example sets.
        String name = att.getName();
        double difference = a.getValue(a.getAttributes().get(name)) - b.getValue(b.getAttributes().get(name));
        sum += difference * difference;
    }
    return Math.sqrt(sum);
}
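For reference, the quantity the script accumulates can be written as the usual sum of squared errors. The symbols are notation introduced here, not names from the script: K is the number of clusters, C_k is the set of examples assigned to cluster k, and c_k is that cluster's centroid. Empty clusters contribute nothing, which matches the `continue` in the loop.

```latex
\mathrm{SSE} = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - c_k \rVert^2
```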
The script is very simple, and at the moment it works only for centroid-based algorithms, but I plan to adapt it to the general case; that only requires calculating the centroids by hand.
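As a rough idea of what "calculating the centroids by hand" could look like, here is a minimal sketch in plain Java. It does not use the RapidMiner API: the class name `SseSketch`, the methods `centroids` and `sse`, and the representation of the clustering as a `double[][]` of points plus an `int[]` of cluster labels are all assumptions made for illustration. Each centroid is taken as the mean of the points assigned to that cluster, and the SSE is then computed against those means.

```java
import java.util.Arrays;

public class SseSketch {

    // Hypothetical helper: mean of the points assigned to each cluster.
    // An empty cluster keeps an all-zero row and is never used by sse().
    static double[][] centroids(double[][] points, int[] labels, int k) {
        int dims = points[0].length;
        double[][] sums = new double[k][dims];
        int[] counts = new int[k];
        for (int i = 0; i < points.length; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < dims; d++) {
                sums[labels[i]][d] += points[i][d];
            }
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                continue; // nothing to average for an empty cluster
            }
            for (int d = 0; d < dims; d++) {
                sums[c][d] /= counts[c];
            }
        }
        return sums;
    }

    // Hypothetical helper: sum of squared Euclidean distances from each
    // point to the hand-computed centroid of its own cluster.
    static double sse(double[][] points, int[] labels, int k) {
        double[][] centers = centroids(points, labels, k);
        double sum = 0;
        for (int i = 0; i < points.length; i++) {
            for (int d = 0; d < points[i].length; d++) {
                double diff = points[i][d] - centers[labels[i]][d];
                sum += diff * diff;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Toy data: two points in cluster 0, one point in cluster 1.
        double[][] points = {{1.0, 1.0}, {3.0, 1.0}, {10.0, 10.0}};
        int[] labels = {0, 0, 1};
        System.out.println("centroids: " + Arrays.deepToString(centroids(points, labels, 2)));
        System.out.println("SSE: " + sse(points, labels, 2));
    }
}
```

On the toy data, cluster 0 has centroid (2, 1) and each of its two points lies at distance 1 from it, so the SSE is 2.0. Inside the Execute Script operator the same two loops would have to run over the examples and their cluster attribute instead of over arrays.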
If you have any comments or suggestions let me know.
Answers
Hello, I used the script but I got this error:
"cannot cast object 'cluster 0:..items cluster 1:..items Total number of items' with class 'com.rapidminer.operator.clustering.ClusterModel' to class 'com.rapidminer.example.ExampleSet'"
thanks