
[SOLVED] Clustering SSE measure calculation

nav Member Posts: 28 Contributor II
edited November 2018 in Help
Hi,

Does anyone know how to calculate the SSE value of a clustering in RapidMiner?

Answers

  • nav Member Posts: 28 Contributor II
    I'll answer my own question and share what I found.

    To calculate the SSE measure I wrote a small script for the Execute Script operator. Here it is:
    /**
    * Author: N@v
    * Version: 0.0.1
    * Date: 11/01/2012
    *
    * Description:
    * This script calculates the SSE measure of a clustering produced by a centroid-based clustering algorithm.
    *
    * Input:
    * input[0]: the example set of the clustering
    * input[1]: the example set of the centroids
    * input[2]: the cluster model produced by the clustering operator
    *
    * Output:
    * The SSE value of the clustering is written to the log console.
    **/

    import com.rapidminer.operator.clustering.ClusterModel;
    import com.rapidminer.operator.clustering.Cluster;

    ExampleSet clusteringSet = input[0];
    ExampleSet centroids = input[1];
    ClusterModel clustering = input[2];

    double sum = 0;
    // Rebuild the ID lookup once, before any example is fetched by ID.
    clusteringSet.remapIds();
    for (Example centroid : centroids) {
        // The script assumes the "cluster" value has an 8-character prefix before the cluster index.
        String key = centroid.getValueAsString(centroid.getAttributes().get("cluster"));
        Cluster cluster = clustering.getCluster(Integer.parseInt(key.substring(8)));

        // Empty clusters contribute nothing to the SSE.
        if (cluster.getNumberOfExamples() == 0) {
            continue;
        }
        Collection<Object> idsList = cluster.getExampleIds();
        for (Object id : idsList) {
            Example example = clusteringSet.getExampleFromId((Double) id);
            double distance = calculateEuclideanDistance(centroid, example);
            sum += distance * distance;
        }
    }
    operator.logNote("SSE: " + sum);


    // Euclidean distance over the regular attributes of a, matched by name in b.
    double calculateEuclideanDistance(Example a, Example b)
    {
        Attribute[] atts = a.getAttributes().createRegularAttributeArray();
        double sum = 0;
        for (Attribute att : atts) {
            String attStr = att.getName();
            double aValue = a.getValue(a.getAttributes().get(attStr));
            double bValue = b.getValue(b.getAttributes().get(attStr));
            double difference = aValue - bValue;
            sum += difference * difference;
        }
        return Math.sqrt(sum);
    }
    The script is very simple, and for now it works only for centroid-based algorithms. I plan to adapt it to the general case, which only requires calculating the centroids by hand.

    If you have any comments or suggestions let me know.
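
    A minimal sketch of the "centroids by hand" idea for the general case, written as stand-alone Java rather than as an Execute Script body. It assumes the data is already available as plain numeric arrays with one cluster index per point; the class name SseSketch and both method names are hypothetical and are not part of the RapidMiner API. Each centroid is taken as the attribute-wise mean of its cluster members, and the SSE is the sum of squared Euclidean distances from each point to its own centroid.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; none of these names come from RapidMiner. */
final class SseSketch {

    /** Computes each cluster's centroid as the attribute-wise mean of its members. */
    static double[][] centroids(double[][] points, int[] labels, int k) {
        int dims = points[0].length;
        double[][] sums = new double[k][dims];
        int[] counts = new int[k];
        for (int i = 0; i < points.length; i++) {
            counts[labels[i]]++;
            for (int d = 0; d < dims; d++) {
                sums[labels[i]][d] += points[i][d];
            }
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                continue; // empty cluster: no members, so it never contributes to the SSE
            }
            for (int d = 0; d < dims; d++) {
                sums[c][d] /= counts[c];
            }
        }
        return sums;
    }

    /** Sums the squared Euclidean distance from every point to its cluster's centroid. */
    static double sse(double[][] points, int[] labels, double[][] centroids) {
        double total = 0;
        for (int i = 0; i < points.length; i++) {
            double[] centroid = centroids[labels[i]];
            for (int d = 0; d < centroid.length; d++) {
                double diff = points[i][d] - centroid[d];
                total += diff * diff;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 3}, {5, 5}, {7, 5}};
        int[] labels = {0, 0, 1, 1};
        double[][] c = centroids(points, labels, 2);
        System.out.println("Centroids: " + Arrays.deepToString(c)); // [[1.0, 2.0], [6.0, 5.0]]
        System.out.println("SSE: " + sse(points, labels, c));       // SSE: 4.0
    }
}
```

    In this toy data every point lies exactly 1 unit from its centroid, so the four squared distances add up to 4.0. Summing squared differences directly also avoids taking a square root only to square it again.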
  • Asmaa_Ali Member Posts: 1 Learner II
    edited July 2019
    Hello,
    I used the script but I got this error:
    "cannot cast object 'cluster 0:..items cluster 1:..items Total number of items' with class 'com.rapidminer.operator.clustering.ClusterModel' to class 'com.rapidminer.example.ExampleSet'"

    Thanks