Filter out rows with dictionary terms

tatianiia Member Posts: 11 Contributor II
edited June 2019 in Help
Hi! I have a problem that appeared quite simple at first glance, but I really don't know how to solve it.
  • I have a file with several text attributes (attr.1, attr.2).
  • I have a long list of terms in another file.
I need to keep only those rows of the first file whose attr.2 does not contain any of the terms from the second file.

So, if the first file contains:
attr.1    attr.2
Sun       Sun is shining
Rain      Rain is falling

and the second contains:
attr.1
Sun
Moon

I want to get the second row from the first file as an output.

I guess there must be some easy solution for that. Thanks!

Answers

  • tatianiia Member Posts: 11 Contributor II
    My solution is:
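    1. Apply the "Process documents" operator to attr.2, with the dictionary terms as the word list, to create a binary vector (one attribute per term);
    2. Keep only the examples in which none of these term attributes is positive, i.e. discard every example that contains at least one term.

    I'm not sure this is the best solution, though.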
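
For readers who want to check the logic outside RapidMiner, the two steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `term_vector` and `rows_without_terms` are hypothetical, terms are assumed to be single words, and matching is assumed to be case-insensitive on whole words. It does not reproduce the operator's actual tokenization settings.

```python
import re

def term_vector(text, terms):
    """Step 1: binary vector, 1 if the term occurs as a whole word in text.

    Assumption: terms are single words and matching ignores case.
    """
    tokens = set(re.findall(r"\w+", text.lower()))
    return [1 if term.lower() in tokens else 0 for term in terms]

def rows_without_terms(rows, terms, column="attr.2"):
    """Step 2: keep only rows whose vector has no positive entry."""
    return [row for row in rows if not any(term_vector(row[column], terms))]

# The example data from the question.
rows = [
    {"attr.1": "Sun", "attr.2": "Sun is shining"},
    {"attr.1": "Rain", "attr.2": "Rain is falling"},
]
terms = ["Sun", "Moon"]

result = rows_without_terms(rows, terms)
print(result)  # [{'attr.1': 'Rain', 'attr.2': 'Rain is falling'}]
```

With the question's data, only the "Rain" row survives, because "Sun" appears in attr.2 of the first row. Multi-word dictionary terms would need a different matching step (for example, substring or n-gram matching), which this sketch does not cover.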