Hello friends of the community, I have a question about distances: why are the average distances negative (Avg. Within distance_cluster centroid)? Isn't one of the properties of a distance that it is never negative?

Hello, I wanted to ask whether anyone could look into the issue of negative distances. Is it a bug? Please confirm, because I need to decide whether to keep using this tool to measure distances; this is for an academic project and my deadlines are approaching. Regards

The problem is not in the process but in the data: when there is a null or zero value, it yields seemingly negative distances. I'll switch tools just in case XD. Regards

I tried to send the sample data in vector form because I cannot attach the Excel file, but neither the Excel file nor the data vector fits within the message size limit. Is there an alternative way to send the data? Regards

## Answers

53 Contributor II

Is it a bug?

Please confirm, because I need to decide whether to keep using this tool to measure distances; this is for an academic project and my deadlines are approaching.

Regards

463 Maven

Can you please post a process so that we can reproduce this?

Best,

Nils

53 Contributor II

I'll switch tools just in case XD

Regards

463 Maven

Best,

Nils

53 Contributor II

I cannot attach the Excel file, nor the data vector, because it exceeds the message size limit. Is there an alternative way to send the data?

Regards

463 Maven

I could reproduce your negative distances with the Performance (Cluster Distance Performance) operator. This is not a bug, though. It is meant to work this way: the distances are multiplied by -1 so that they can be used for optimization. If you want to see the positive distances, select the 'maximize' parameter. But you should not use the resulting performance objects for optimization if you have selected this parameter!

The reason for multiplying by -1: the Performance (Cluster Distance Performance) operator calculates the average distance within centroids. The smaller the distances are, the better the clustering works (in theory). But our optimization operators always try to maximize the performance of an algorithm. This means that if you don't multiply by -1, the optimization algorithm would always prefer cluster results with a higher average distance within centroids.

Best,

Nils
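
The sign flip described above can be illustrated with a small, tool-independent sketch. This is not RapidMiner code: the helper names `avg_within_centroid_distance` and `as_performance` are hypothetical, the toy data is made up for illustration, and Euclidean distance is an assumption, since the thread does not state which distance measure the operator uses.

```python
import math

def avg_within_centroid_distance(points, labels, centroids):
    """Mean distance from each point to its assigned centroid.
    Euclidean distance is an assumption made for this sketch."""
    total = sum(math.dist(p, centroids[k]) for p, k in zip(points, labels))
    return total / len(points)

def as_performance(distance, maximize=False):
    """Hypothetical helper: negate the distance unless 'maximize' is set,
    so that an optimizer which always maximizes prefers small distances."""
    return distance if maximize else -distance

# Toy data: two tight groups far apart from each other.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# A sensible clustering: each centroid sits in the middle of its group.
good = avg_within_centroid_distance(
    points, [0, 0, 1, 1], {0: (0.0, 1.0), 1: (10.0, 1.0)})

# A poor clustering: each cluster mixes points from both groups.
bad = avg_within_centroid_distance(
    points, [0, 1, 0, 1], {0: (5.0, 0.0), 1: (5.0, 2.0)})

# An optimizer that maximizes picks the sensible clustering only
# when the distances have been negated.
negated = {"good": as_performance(good), "bad": as_performance(bad)}
raw = {"good": as_performance(good, maximize=True),
       "bad": as_performance(bad, maximize=True)}

best_with_negation = max(negated, key=negated.get)
best_without_negation = max(raw, key=raw.get)

print(negated, "->", best_with_negation)
print(raw, "->", best_without_negation)
```

With the negated values, maximizing selects the clustering with the smaller average distance. With the raw positive values, the same maximizer selects the worse clustering, which matches the warning above about not feeding the positive values into an optimization.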

53 Contributor II

Now I understand.

Regards