I have two datasets with 4 classes. Both are parameterised (tabular) versions of grain structure images; the parameters are grain size, etc. The thing is, the second time the images were parameterised, the resulting dataset scored about 10% better than the first. I now want to understand why that is, so I would like to compare the two datasets. However, in the visualisation they appear to have a completely identical range, standard deviation, etc. I am using RapidMiner as a tool. I looked at the deviation chart and it looked almost the same for both. My question is: is there a way to compare two datasets and draw reliable conclusions about why one is better than the other? What is the best way to compare them, and how would you proceed?


    Does nobody have an idea?

    Can you adjust the jitter size on the graphs? That should give you a finer-grained plot.
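
One more thought on the original question: range and standard deviation only summarise part of a distribution, so two columns can match on both and still differ in shape. A rough way to check this outside RapidMiner is to compare each parameter, class by class, with the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions). The sketch below is only an illustration under my own assumptions: the function names and the layout of the input (a dict keyed by class label and parameter name) are hypothetical, and you would have to export your two tables into that form first.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of a and b:
    0.0 means the samples look identical, 1.0 means they do not overlap.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def rank_features(first, second):
    """Rank (class_label, parameter) pairs by how much they differ.

    first and second are hypothetical dicts mapping
    (class_label, parameter_name) -> list of numeric values.
    Only keys present in both datasets are compared.
    """
    scores = {
        key: ks_statistic(first[key], second[key])
        for key in first
        if key in second
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Parameters that end up at the top of the ranking are the ones whose distributions changed most between the two parameterisations, so those are the first ones I would look at. This only compares one parameter at a time; it will not show differences that exist only in combinations of parameters.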

