Filtering of documents using document similarity

June 23
I have 100 comments that I process and run Document Similarity against. This works great, but I'm only interested in documents containing specific words, say "workload". So I filter the example set and rerun the process, and Document Similarity gives me results for only those documents containing "workload". Perfect. The problem is that filtering also remaps the document IDs to the filtered set, so I no longer know the original document IDs. This makes it very difficult to find the originals, because a 0.73 similarity between doc IDs 1 and 17 in the filtered set does not correspond to documents 1 and 17 in the original, unfiltered data set.

Is there a way to keep the original IDs in the filtered dataset?
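
To make the behavior I'm after concrete, here is a minimal sketch in plain Python. It is not the tool's own process, just an illustration under my own assumptions: the sample comments, the function names `cosine` and `similarity_with_original_ids`, and the simple word-count cosine similarity are all hypothetical stand-ins. The point is that each document gets its ID before filtering, so the similarity pairs are still keyed by the original IDs afterwards.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_with_original_ids(docs, keyword):
    """Filter docs by keyword, then compare them pairwise.

    `docs` maps original document ID -> text. Because the ID travels with
    each document through the filter, the result is keyed by original IDs.
    """
    kept = {
        doc_id: Counter(text.lower().split())
        for doc_id, text in docs.items()
        if keyword in text.lower().split()
    }
    ids = sorted(kept)
    return {
        (i, j): cosine(kept[i], kept[j])
        for n, i in enumerate(ids)
        for j in ids[n + 1:]
    }


# Hypothetical sample data: IDs are assigned before any filtering happens.
docs = {
    1: "the workload is too high",
    2: "lunch was great",
    3: "workload keeps growing",
    17: "the workload is too high this month",
}

pairs = similarity_with_original_ids(docs, "workload")
for (i, j), score in pairs.items():
    print(f"doc {i} vs doc {j}: {score:.2f}")
```

Here document 2 is dropped by the filter, but the remaining pairs are still reported as (1, 3), (1, 17) and (3, 17) rather than being renumbered 1 to 3.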
