[SOLVED] Clustering SSE measure calculation

navnav Member Posts: 28  Maven
edited November 2018 in Help
Hi,

Does anyone know how to calculate the SSE value of a clustering in RapidMiner?

Answers

  • navnav Member Posts: 28  Maven
    I'll answer my own question and share what I found.

    To calculate the SSE measure, I wrote a small script for the Execute Script operator. Here it is:
    /**
    * Author: [email protected]
    * Version: 0.0.1
    * Date: 11/01/2012
    *
    * Description:
    * This script calculates the SSE measure of a clustering produced by a centroid-based clustering algorithm.
    *
    * Input:
    * input[0]: the example set of the clustering
    * input[1]: the example set of the centroids
    * input[2]: the cluster model of the cluster operator
    *
    * Output:
    * The SSE value of the clustering is written to the log console.
    **/

    import com.rapidminer.operator.clustering.ClusterModel;
    import com.rapidminer.operator.clustering.Cluster;

    ExampleSet clusteringSet = input[0];
    ExampleSet centroids = input[1];
    ClusterModel clustering = input[2];

    double sum = 0;
    // Build the ID lookups once, before any getExampleFromId call.
    centroids.remapIds();
    clusteringSet.remapIds();
    for (Example centroid : centroids) {
        // Assumes centroid labels of the form "cluster_N": drop the 8-character prefix to get N.
        String key = centroid.getValueAsString(centroid.getAttributes().get("cluster"));
        Cluster cluster = clustering.getCluster(Integer.parseInt(key.substring(8)));

        // Empty clusters contribute nothing to the SSE.
        if (cluster.getNumberOfExamples() == 0) {
            continue;
        }
        for (Object id : cluster.getExampleIds()) {
            Example example = clusteringSet.getExampleFromId((Double) id);
            double distance = calculateEuclideanDistance(centroid, example);
            sum += distance * distance;
        }
    }
    operator.logNote("SSE: " + sum);


    // Euclidean distance between two examples over the regular attributes of a.
    Double calculateEuclideanDistance(Example a, Example b)
    {
        double sum = 0;
        for (Attribute att : a.getAttributes().createRegularAttributeArray()) {
            // Look the attribute up by name in each example, since a and b come from different example sets.
            String attStr = att.getName();
            double aValue = a.getValue(a.getAttributes().get(attStr));
            double bValue = b.getValue(b.getAttributes().get(attStr));
            double difference = aValue - bValue;
            sum += difference * difference;
        }
        return Math.sqrt(sum);
    }
    The script is very simple, and for now it works only with centroid-based algorithms. I plan to adapt it to the general case; that only requires calculating the centroids by hand.

    If you have any comments or suggestions, let me know.
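
For the general case mentioned above, here is a minimal sketch in plain Java that computes the centroids by hand as per-cluster means and then sums the squared distances. It does not use the RapidMiner API: the class `SseSketch`, its methods, and the array-based inputs (`points`, `labels`, `k`) are hypothetical names chosen for illustration, and it assumes every label is an integer in the range 0 to k-1.

```java
/** Hypothetical helper, not part of RapidMiner: SSE from raw points and cluster labels. */
final class SseSketch {

    /** Mean of the points assigned to each cluster; an empty cluster keeps a zero vector. */
    static double[][] centroids(double[][] points, int[] labels, int k) {
        int dims = points.length == 0 ? 0 : points[0].length;
        double[][] sums = new double[k][dims];
        int[] counts = new int[k];
        for (int i = 0; i < points.length; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < dims; d++) {
                sums[labels[i]][d] += points[i][d];
            }
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                continue; // no members, so nothing to average
            }
            for (int d = 0; d < dims; d++) {
                sums[c][d] /= counts[c];
            }
        }
        return sums;
    }

    /** Sum over all points of the squared Euclidean distance to their own cluster mean. */
    static double sse(double[][] points, int[] labels, int k) {
        double[][] mu = centroids(points, labels, k);
        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            double[] centroid = mu[labels[i]];
            for (int d = 0; d < centroid.length; d++) {
                double diff = points[i][d] - centroid[d];
                total += diff * diff;
            }
        }
        return total;
    }
}
```

For example, the points (0,0), (2,0), (10,10), (10,12) with labels 0, 0, 1, 1 give centroids (1,0) and (10,11), and each point lies at squared distance 1 from its centroid, so `SseSketch.sse(points, labels, 2)` returns 4.0. Because the squared distance is accumulated directly, no square root is needed.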
  • Asmaa_Ali Member Posts: 1 Contributor I
    edited July 2019
    Hello,
    I used the script but I got this error:
    "cannot cast object 'cluster 0:..items cluster 1:..items Total number of items with class 'com.rapidminer.operator.clustering.ClusterModel' to class 'com.rapidminer.example.ExampleSet'"

    Thanks