
kNN with Optimize Parameters (Grid)

User8259 (Member, University Professor; Posts: 8)
I am running a few Split-Validation experiments using Optimize Parameters (Grid) with kNN. In the runs below, everything is held fixed except the changes noted, yet the results are inconsistent:

Batch 1:-

Run 1. k = 1, 3, 5, ..., 25.
Run 2. k = 1, 3, 5.
Run 3. k = 1.

The inconsistency is in the results (Accuracy, Kappa, F-Measure) for k = 1.

Run 1 produces different results than Runs 2 and 3 despite all else being held fixed.
Run 2 differs from Run 1 only when Local Seed is 1. They agree for the remaining seed choices.
The results of Runs 1 and 2 agree for k = 3 and k = 5.

Because the problem appeared to manifest with k = 1, I tried a few more runs that started with k = 3 instead of 1.

Batch 2:-

Run 1. k = 3, 5, ..., 25.
Run 2. k = 3, 5, 7, 9, 11.
Run 3. k = 3.

Again, mutual inconsistencies showed up only with k = 3.

Notably, the same results showed up for k = 3 and Local Seed = 1 in Run 1 and k = 3 and Local Seed = 11 in Run 2. There may be other such peculiarities but this caught my eye.

The Seed = 1 and Seed = 11 results for the two runs are not the same but the Grid results for Seed 1 and Seed 11 "criss-cross" as just mentioned between the two runs.

And, the results for k = 3 from the second batch of three runs do not match the results for k = 3 from the first batch.

As stated at the outset, all else is exactly the same in these runs, to my knowledge. Am I missing something obvious?

I am using the same platform for all of these runs. I can share the input file, the respective process files, and the results (logged into an Excel sheet for ease of comparison) by email with anybody who wants to take a look. Please send me an email address.

Thanks!

Best Answer

  • User8259 (Member, University Professor; Posts: 8)
    Solution Accepted
    An Update on the Above:

    I found that if I specify "Use Local Random Seed" and set it to "true", in addition to specifying the "Local Random Seed" values, I get consistent results between the runs. In earlier versions, specifying "Local Random Seed" values alone was apparently enough: RMS treated "Use Local Random Seed" as "true" without my having to set it explicitly. Hopefully the results I am getting are also correct, but they are certainly mutually consistent.
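
    For readers wondering how a seed that is not actually applied can make results depend on the grid, here is a minimal Python sketch of one possible mechanism. It is not RapidMiner's implementation. The function names, the 20-row data set, and the 0.7 split ratio are all hypothetical. It only contrasts a random generator shared across grid iterations with a generator that is re-seeded locally for every split.

```python
import random


def split_indices(n_rows, train_ratio, rng):
    """Shuffle row indices with the given generator and cut them into train/test."""
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = int(n_rows * train_ratio)
    return indices[:cut], indices[cut:]


def grid_with_shared_rng(k_values, n_rows=20, train_ratio=0.7, seed=1):
    """Hypothetical case: one generator is shared by the whole grid,
    so each split depends on how many draws happened before it."""
    rng = random.Random(seed)
    return {k: split_indices(n_rows, train_ratio, rng) for k in k_values}


def grid_with_local_seed(k_values, n_rows=20, train_ratio=0.7, seed=1):
    """Hypothetical case: a fresh generator with the same seed is created
    for every split, so the split no longer depends on the grid."""
    return {
        k: split_indices(n_rows, train_ratio, random.Random(seed))
        for k in k_values
    }


# Same k = 3, two different grids.
shared_long = grid_with_shared_rng([1, 3, 5])
shared_short = grid_with_shared_rng([3, 5])
local_long = grid_with_local_seed([1, 3, 5])
local_short = grid_with_local_seed([3, 5])

print("shared generator, k = 3 splits match:", shared_long[3] == shared_short[3])
print("local seed, k = 3 splits match:", local_long[3] == local_short[3])
```

    With the shared generator, the train/test split used for k = 3 changes when the grid changes, so Accuracy, Kappa and F-Measure computed on it would change too. With a local seed, the split for a given k is the same in every run, which matches the consistency described above.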