kNN with Optimize Parameters (Grid)
I am trying a few SplitValidation experiments using Optimize Parameters (Grid) with kNN. In the runs below, everything is held the same except for the changes noted, yet the results are inconsistent:
Batch 1:
Run 1. k = 1, 3, 5, ..., 25.
Run 2. k = 1, 3, 5.
Run 3. k = 1.
The inconsistency is with the results (Accuracy, Kappa, FMeasure) for k = 1.
Run 1 produces different results than Runs 2 and 3 despite all else being held fixed.
Run 2 differs from Run 1 only when Local Seed is 1. They agree for the remaining seed choices.
Runs 1 & 2 results agree for k = 3 & 5.
Because the problem appeared to manifest with k = 1, I tried a few more runs that started with k = 3 instead of 1.
Batch 2:
Run 1. k = 3, 5, ..., 25.
Run 2. k = 3, 5, 7, 9, 11.
Run 3. k = 3.
Again, mutual inconsistencies showed up only with k = 3.
Notably, the results for k = 3 with Local Seed = 1 in Run 1 are the same as the results for k = 3 with Local Seed = 11 in Run 2. There may be other such peculiarities, but this one caught my eye.
The Seed = 1 and Seed = 11 results for the two runs are not the same, but the Grid results for Seed 1 and Seed 11 "crisscross" between the two runs, as just mentioned.
Also, the results for k = 3 from the second batch of three runs do not match the results for k = 3 from the first batch.
As stated at the outset, all else is exactly the same in these runs, to my knowledge. Am I missing something obvious?
I am using the same platform for all of these runs. I can share the input file, the respective process files, and the results (logged in an Excel sheet for ease of comparison) by email with anybody who wants to take a look. Please send me an email address.
Thanks!
Best Answer

User8259 (Member, University Professor, Posts: 8)
An Update on the Above:
I found that if I specify "Use Local Random Seed" and select a value of "true" for this parameter, I get consistent results between the runs. In earlier versions, if I specified only the "Local Random Seed" values to use, RMS apparently understood that "Use Local Random Seed" was "true"; I did not have to specify "Use Local Random Seed" as well. Hopefully the results I am getting are also correct, but they are certainly mutually consistent.
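For anyone wondering why a seed value that is not actually switched on could give run-to-run differences, here is a minimal Python sketch of one possible mechanism. It is an analogy only, not the tool's actual implementation: the function names, the `global_seed` value, and the 20-row, 70% split are all hypothetical. It assumes that when the local seed is not in use, every split draws from one shared generator, so the split a given k receives depends on what was drawn before it.

```python
import random

def split_indices(n_rows, train_ratio, rng):
    """Shuffle row indices with the given RNG and cut them into train/test."""
    idx = list(range(n_rows))
    rng.shuffle(idx)
    cut = int(n_rows * train_ratio)
    return idx[:cut], idx[cut:]

def run_grid(k_values, use_local_seed, local_seed=1, global_seed=2001):
    """Return the training split each k would see in a hypothetical grid run."""
    shared = random.Random(global_seed)  # stands in for a process-wide generator
    splits = {}
    for k in k_values:
        # With the local seed enabled, every iteration starts from a fresh,
        # identically seeded generator; otherwise it consumes the shared one.
        rng = random.Random(local_seed) if use_local_seed else shared
        train, _ = split_indices(20, 0.7, rng)
        splits[k] = tuple(train)
    return splits

# Shared generator: k = 3 is the second draw in one grid and the first in the other.
a = run_grid([1, 3, 5], use_local_seed=False)
b = run_grid([3, 5, 7], use_local_seed=False)
shared_rng_consistent = a[3] == b[3]

# Local seed enabled: k = 3 gets the same split in both grids.
c = run_grid([1, 3, 5], use_local_seed=True)
d = run_grid([3, 5, 7], use_local_seed=True)
local_seed_consistent = c[3] == d[3]

print("shared generator consistent:", shared_rng_consistent)
print("local seed consistent:", local_seed_consistent)
```

In this sketch the split for k = 3 only stays fixed across grids when each iteration re-seeds its own generator, which is at least in line with the behavior I saw once "Use Local Random Seed" was explicitly set to "true". It does not by itself explain every pattern reported above (for example, k = 1 differing even when it is the first grid value), so treat it as an illustration rather than a diagnosis.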