"Can a Groovy script count clusters?"

awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
edited May 2019 in Help
Hello all,

The Cluster Count Performance operator returns very odd values. I decided to look at the code to see what was going on, and I noticed these lines in the file 'ClusterNumberEvaluator.java' at about line 90:

for (int i = 0; i < model.getNumberOfClusters(); i++)
          numItems = +model.getCluster(i).getNumberOfExamples();
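Because "= +" is a plain assignment with a unary plus rather than an accumulation, numItems is overwritten on every pass and ends up as the number of examples in the last cluster only.

This gets used later in this line: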

PerformanceCriterion pc = new EstimatedPerformance("Number of clusters", 1.0 - (((double) model.getNumberOfClusters()) / ((double) numItems)), 1, false);
So this leads to weird values. I think changing the assignment to numItems += model.getCluster(i).getNumberOfExamples() should fix it.
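
To make the effect concrete, here is a minimal, self-contained Java sketch. It is not the RapidMiner source: the model is replaced by a plain int array of made-up cluster sizes, and the class and method names are hypothetical. It only shows how "= +" and "+=" differ and how the result feeds the same formula as the EstimatedPerformance line.

```java
final class ClusterCountBugDemo {

    // Mirrors the quoted loop: "= +" assigns the value (unary plus),
    // so each pass overwrites numItems and only the last size survives.
    static int buggyTotal(int[] clusterSizes) {
        int numItems = 0;
        for (int i = 0; i < clusterSizes.length; i++) {
            numItems = +clusterSizes[i];
        }
        return numItems;
    }

    // Proposed fix: "+=" accumulates the sizes of all clusters.
    static int fixedTotal(int[] clusterSizes) {
        int numItems = 0;
        for (int i = 0; i < clusterSizes.length; i++) {
            numItems += clusterSizes[i];
        }
        return numItems;
    }

    // Same formula as the value passed to EstimatedPerformance.
    static double criterion(int numberOfClusters, int numItems) {
        return 1.0 - (((double) numberOfClusters) / ((double) numItems));
    }

    public static void main(String[] args) {
        int[] sizes = {40, 50, 10}; // hypothetical cluster sizes

        System.out.println(buggyTotal(sizes)); // prints 10 (last cluster only)
        System.out.println(fixedTotal(sizes)); // prints 100 (all examples)

        // The buggy total makes the criterion depend on the last cluster's size.
        System.out.println(criterion(sizes.length, buggyTotal(sizes)));
        System.out.println(criterion(sizes.length, fixedTotal(sizes)));
    }
}
```

With these made-up sizes the buggy loop divides by 10 instead of 100, so the criterion changes whenever the last cluster happens to be large or small, which would explain the odd values.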

Anyway, my question, before I embark on it: will it be possible to use the Groovy scripting operator to calculate this myself?




  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    ok, that's a typo causing a lot of headache :) I corrected this by removing the blank between = and +...

    Additionally, I have changed the behavior of the operator so that it now returns two criteria, one of which contains the actual number.

    And of course you can do this with the Groovy Scripting operator.
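
As a rough idea of the counting logic such a script would need, here is a minimal sketch in plain Java (syntax that Groovy largely accepts). It deliberately uses no RapidMiner classes: the cluster labels are assumed to have already been read into a list of strings, and the class name, method name, and label values are hypothetical. Reading the cluster attribute from the example set inside the scripting operator is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ClusterCounter {

    // Counts the distinct cluster labels; each list entry is the label
    // assigned to one example (for instance "cluster_0").
    static int countClusters(List<String> clusterLabels) {
        Set<String> distinctLabels = new HashSet<>(clusterLabels);
        return distinctLabels.size();
    }

    public static void main(String[] args) {
        // Hypothetical labels for five examples spread over three clusters.
        List<String> labels = Arrays.asList(
                "cluster_0", "cluster_1", "cluster_0", "cluster_2", "cluster_1");

        System.out.println(countClusters(labels)); // prints 3
    }
}
```

Using a set keeps the count independent of cluster sizes, so it avoids the accumulation step where the original bug lived.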

  • awchisholm RapidMiner Certified Expert, Member Posts: 458 Unicorn
    Hello Sebastian

    If you say it can be done with Groovy then I'll try it.

