[SOLVED] Clustering SSE measure calculation

nav Member Posts: 28 Maven
edited November 2018 in Help
Hi,

Does anyone know how to calculate the SSE (sum of squared errors) value of a clustering in RapidMiner?

Answers

  • nav Member Posts: 28 Maven
    I'll answer my own question and share what I found.

    To calculate the SSE measure, I wrote a small script for the Execute Script operator. Here it is:
    /**
     * Author: [email protected]
     * Version: 0.0.1
     * Date: 11/01/2012
     *
     * Description:
     * This script calculates the SSE measure of a clustering produced by a
     * centroid-based clustering algorithm.
     *
     * Input (the ports must be connected in this order):
     * input[0]: the clustered example set
     * input[1]: the example set of the centroids
     * input[2]: the cluster model produced by the clustering operator
     *
     * Output:
     * The SSE value of the clustering is written to the log console.
     **/

    import com.rapidminer.example.Attribute;
    import com.rapidminer.example.Example;
    import com.rapidminer.example.ExampleSet;
    import com.rapidminer.operator.clustering.Cluster;
    import com.rapidminer.operator.clustering.ClusterModel;

    // The input ports must be connected in exactly this order.
    ExampleSet clusteringSet = input[0];
    ExampleSet centroids = input[1];
    ClusterModel clustering = input[2];

    // Build the ID lookups once so that getExampleFromId() can resolve example IDs.
    centroids.remapIds();
    clusteringSet.remapIds();

    double sum = 0.0;
    for (Example centroid : centroids) {
        // Assumes the centroid label has the form "cluster_<n>"; skip the 8-character prefix.
        String key = centroid.getValueAsString(centroid.getAttributes().get("cluster"));
        Cluster cluster = clustering.getCluster(Integer.parseInt(key.substring(8)));

        // Empty clusters contribute nothing to the SSE.
        if (cluster.getNumberOfExamples() == 0) {
            continue;
        }

        for (Object id : cluster.getExampleIds()) {
            Example example = clusteringSet.getExampleFromId((Double) id);
            double distance = calculateEuclideanDistance(centroid, example);
            sum += distance * distance;
        }
    }
    operator.logNote("SSE: " + sum);


    // Euclidean distance between two examples over the regular attributes of a.
    double calculateEuclideanDistance(Example a, Example b) {
        double sum = 0.0;
        for (Attribute att : a.getAttributes().createRegularAttributeArray()) {
            // Look each attribute up by name, since a and b come from different example sets.
            String name = att.getName();
            double difference = a.getValue(a.getAttributes().get(name)) - b.getValue(b.getAttributes().get(name));
            sum += difference * difference;
        }
        return Math.sqrt(sum);
    }
    The script is very simple. For now it works only for centroid-based algorithms, but I plan to adapt it to the general case; that only requires calculating the centroids by hand.
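For the general case mentioned above, a minimal sketch in plain Java (no RapidMiner classes) might look like the following. It assumes the data is available as a `double[][]` of points plus an `int[]` of cluster indices; the class name `SseSketch` and the methods `computeCentroids` and `sse` are hypothetical names chosen for illustration, not part of the RapidMiner API. Each centroid is taken as the mean of the points assigned to its cluster, and the SSE is the sum of squared Euclidean distances from each point to its own centroid.

```java
import java.util.Arrays;

public class SseSketch {

    /** Computes one centroid per cluster as the mean of the points assigned to it. */
    static double[][] computeCentroids(double[][] points, int[] assignments, int k) {
        int dims = points[0].length;
        double[][] centroids = new double[k][dims];
        int[] counts = new int[k];
        for (int i = 0; i < points.length; i++) {
            int c = assignments[i];
            counts[c]++;
            for (int d = 0; d < dims; d++) {
                centroids[c][d] += points[i][d];
            }
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                continue; // empty cluster: no points, so it adds nothing to the SSE
            }
            for (int d = 0; d < dims; d++) {
                centroids[c][d] /= counts[c];
            }
        }
        return centroids;
    }

    /** Sums the squared Euclidean distance of every point to its cluster centroid. */
    static double sse(double[][] points, int[] assignments, double[][] centroids) {
        double sum = 0.0;
        for (int i = 0; i < points.length; i++) {
            double[] centroid = centroids[assignments[i]];
            for (int d = 0; d < points[i].length; d++) {
                double diff = points[i][d] - centroid[d];
                sum += diff * diff;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Toy data: two well-separated clusters of two points each.
        double[][] points = {{1.0, 1.0}, {3.0, 1.0}, {10.0, 10.0}, {10.0, 12.0}};
        int[] assignments = {0, 0, 1, 1};
        double[][] centroids = computeCentroids(points, assignments, 2);
        System.out.println("Centroids: " + Arrays.deepToString(centroids));
        System.out.println("SSE: " + sse(points, assignments, centroids)); // prints "SSE: 4.0"
    }
}
```

Working on squared differences directly avoids taking a square root only to square the result again, which is what the distance-based version does.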

    If you have any comments or suggestions, let me know.
  • Asmaa_Ali Member Posts: 1 Contributor I
    edited July 25
    Hello,
    I used the script, but I got this error:
    "cannot cast object'cluster 0:..items cluster 1:..items Total number of items with class'com.rapidminer.operator.clustering.ClusterModel' to class 'com.rapidminer.example.ExampeSet'

    Thanks