RapidMiner 4.2 MPCK-Means random results
I know nothing about code and I needed to implement MPCK-Means to solve a problem, so I was happy to find an implementation of this algorithm in RapidMiner 4.2. I have 83 instances, and the number of dimensions ranges from 1 to 5 (I have several sets of variables and cluster the same instances separately on each set). Among those 83 instances, I have 14 "neighborhoods" of points connected by must-link constraints, and I don't have any cannot-link constraints.

However, I just discovered that the results I got with this algorithm changed completely when I repeated the process some time later with the same dataset and the same parameter settings. I am also noticing that the clusters change if I change the random seed. I thought this could come from different starting seeds, but the algorithm supposedly uses "farthest first" instead of random initialization (as described in the original paper). I tried repeating the process with 1000 initializations (the default is 5) and 100 iterations per run, and the results still change when I change the random number seed! With the same seed I consistently get the same clusters today, but tomorrow I may consistently get a different clustering result with the same parameters and the same fixed seed.

This is a bit scary because now I can't decide which clustering results I should trust! Why is this happening? Is it only my data? Is this the reason why the clusterer was removed in RM 4.3?
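
For illustration only, here is a minimal Python sketch of plain farthest-first traversal. It is not the RapidMiner 4.2 code, and it leaves out the constraint-weighted variant from the MPCK-Means paper. It assumes, as the textbook version of the method commonly does, that the first center is picked arbitrarily (here with a seeded random number generator), which would be one way a "farthest first" initialization could still depend on the random seed. Whether the RapidMiner operator behaves this way is an assumption, and the sketch does not explain results that differ from day to day under the same seed. The function names and the toy data are hypothetical.

```python
import math
import random

def farthest_first(points, k, first):
    """Return k center indices, starting from index `first`.

    Each new center is the point farthest from all centers chosen so far.
    Given `first`, the procedure is fully deterministic.
    """
    centers = [first]
    # Distance from every point to its nearest chosen center.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        centers.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return centers

def seeded_farthest_first(points, k, seed):
    """Assumption: the first center is drawn at random from the seed."""
    rng = random.Random(seed)
    return farthest_first(points, k, rng.randrange(len(points)))

# Hypothetical one-dimensional toy data, not the dataset from the post.
toy_points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]

# Different starting points give different sets of initial centers.
from_index_0 = farthest_first(toy_points, 2, first=0)
from_index_2 = farthest_first(toy_points, 2, first=2)
```

In this toy example, starting from index 0 selects the points at 0 and 10 as centers, while starting from index 2 selects the points at 2 and 10, so the clustering that follows can differ even though every later step is deterministic.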