[SOLVED] negative distances

MarcosRL Member Posts: 53 Contributor II
edited November 2018 in Help
Hello friends of the community, I have a question about distances:
Why are the average distances negative (Avg. Within distance_cluster centroid)?
Isn't it one of the properties of a distance that it is never negative?

Regards

Answers

  • MarcosRL Member Posts: 53 Contributor II
    Hello, I wanted to ask whether someone could look into the issue of negative distances.
    Is it a bug?
    Please confirm, because I need to decide whether to keep using this tool to measure the distances. It is for an academic project and my deadlines are close.
    Regards
  • Nils_Woehler Member Posts: 463 Maven
    Hi,

    can you please post a process so that we can reproduce this?

    Best,
    Nils
  • MarcosRL Member Posts: 53 Contributor II
    The problem is not in the process but in the data: when there is a null or a zero, it seemingly yields negative distances.
    I will switch tools just in case XD
    Regards
  • Nils_Woehler Member Posts: 463 Maven
    Can you provide sample data to reproduce this? Otherwise we cannot check if it is a bug and what is happening.

    Best,
    Nils
  • MarcosRL Member Posts: 53 Contributor II
    I wanted to send the sample data in vector form, because I cannot attach the Excel file.
    However, I cannot post the data vector either, because it exceeds the message size limit. Is there an alternative way to send the data?
    Regards
  • Nils_Woehler Member Posts: 463 Maven
    Hi,

    I could reproduce your negative distances with the Performance (Cluster Distance Performance) operator. However, this is not a bug. It works this way by design: the distances are multiplied by -1 so that they can be used for optimization. If you want to see the positive distances, select the 'maximize' parameter. But do not use the resulting performance objects for optimization if you have selected this parameter!

    The reason for multiplying by -1: the Performance (Cluster Distance Performance) operator calculates the average within-centroid distance. In theory, the smaller the distances, the better the clustering. But our optimization operators always try to maximize the performance of an algorithm. So if the distances were not multiplied by -1, the optimization algorithm would always prefer cluster results with a higher average within-centroid distance.
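
To illustrate the sign flip described above, here is a minimal Python sketch. It is not RapidMiner's implementation: the helper names `avg_within_centroid_distance` and `as_performance`, the use of plain Euclidean distance, and the toy data are all assumptions made for the example. It shows that an optimizer which always maximizes picks the tighter clustering only when the distance is negated.

```python
import math


def avg_within_centroid_distance(points, labels):
    """Mean Euclidean distance from each point to its own cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(axis) / len(members) for axis in zip(*members)]
        total += sum(math.dist(p, centroid) for p in members)
    return total / len(points)


def as_performance(distance, maximize=False):
    """Hypothetical helper: report the negated distance unless `maximize` is set."""
    return distance if maximize else -distance


# Toy data (assumed): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
candidates = {
    "tight": [0, 0, 1, 1],  # groups nearby points together
    "loose": [0, 1, 0, 1],  # groups distant points together
}

# Default behaviour: negated distances, so "maximize" means "smallest distance".
scores = {
    name: as_performance(avg_within_centroid_distance(points, labels))
    for name, labels in candidates.items()
}
best = max(scores, key=scores.get)

# With maximize=True the raw positive distances are reported instead.
raw = {
    name: as_performance(avg_within_centroid_distance(points, labels), maximize=True)
    for name, labels in candidates.items()
}
worst_pick = max(raw, key=raw.get)

print(scores, best)
print(raw, worst_pick)
```

In this sketch, maximizing the negated values selects the "tight" clustering, while maximizing the raw positive values selects the "loose" one, which is why the positive values should not be fed to an optimizer.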

    Best,
    Nils
  • MarcosRL Member Posts: 53 Contributor II
    Thank you very much  :D
    now I understand  :)
    Regards