Filter a list of words with the Filter Tokens by Content operator

waqaskhan343 Member Posts: 11 Contributor I
edited October 2019 in Help

How can I filter a list of words using the Filter Tokens by Content operator? I have a list of 300 words, but the operator accepts only one word. I have read all of the solutions related to this problem, but none of them works for me.

Thank you in advance.

Best Answer

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    With that volume of words I would suggest the custom stopword dictionary option I mentioned, the "Filter Stopwords (Dictionary)" operator. Just create a simple text file with all the words you don't want (one word per line) and connect that as your external dictionary file. Only the words that are not on that list will be retained.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
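
For the case raised in this thread (keep 300 specific tokens out of 1,500), the dictionary file has to list the words to drop, not the words to keep. Below is a minimal Python sketch of preparing that file outside RapidMiner. It assumes the full token list and the keep list are available as plain Python lists; the function names and the `stopwords.txt` file name are hypothetical and not part of RapidMiner.

```python
from pathlib import Path


def build_stopword_list(all_tokens, keep_tokens):
    """Return every token that is NOT on the keep list, de-duplicated and sorted."""
    keep = set(keep_tokens)
    return sorted(set(all_tokens) - keep)


def write_stopword_file(path, stopwords):
    """Write one word per line, the layout described in the answer above."""
    Path(path).write_text("\n".join(stopwords) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Toy data standing in for the 1,500 tokens and the 300 to keep (assumed values).
    all_tokens = ["price", "delivery", "the", "and", "refund", "very"]
    keep_tokens = ["price", "delivery", "refund"]

    stopwords = build_stopword_list(all_tokens, keep_tokens)
    write_stopword_file("stopwords.txt", stopwords)
    print(stopwords)
```

The resulting file is what would be connected as the external dictionary. Matching here is exact and case-sensitive, so the sketch assumes the tokens in the file are spelled the same way as the tokens produced in the process.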

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    You can use "Replace Tokens" first to change every token you want to eliminate (this operator takes multiple rows of inputs) into something like "replaceme" and then just use a single "Filter Tokens by Content" on "replaceme" after those substitutions.

    Or you can use the "Filter Stopwords (Dictionary)" operator and use an external file of words to eliminate.

     

     

  • waqaskhan343 Member Posts: 11 Contributor I

    @Telcontar120, thank you for your reply.

    I have 1,500 tokens, of which I want to keep only 300 specific ones and exclude all the others from the example set. How can I use Replace Tokens in this scenario?

  • waqaskhan343 Member Posts: 11 Contributor I

    @Telcontar120 Oh yes, it's working. Thank you for the help. You saved my life :)
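
The replace-then-filter idea from the first answer can be illustrated in plain Python. This is only a sketch of the logic, not RapidMiner's API: in RapidMiner the two steps would be carried out by the "Replace Tokens" and "Filter Tokens by Content" operators. The function names and the sample data are hypothetical; `replaceme` is the marker string suggested in the answer.

```python
PLACEHOLDER = "replaceme"  # marker string taken from the answer above


def replace_unwanted(tokens, keep_tokens, placeholder=PLACEHOLDER):
    """Step 1: swap every token that is not on the keep list for the placeholder."""
    keep = set(keep_tokens)
    return [t if t in keep else placeholder for t in tokens]


def drop_placeholder(tokens, placeholder=PLACEHOLDER):
    """Step 2: a single content filter that removes the placeholder token."""
    return [t for t in tokens if t != placeholder]


if __name__ == "__main__":
    # Toy document and keep list (assumed values for illustration).
    document = ["the", "price", "and", "delivery", "were", "fine"]
    keep_tokens = ["price", "delivery"]

    replaced = replace_unwanted(document, keep_tokens)
    print(replaced)
    print(drop_placeholder(replaced))
```

With a keep list of 300 words, the tokens to replace are simply all the others, so the replacement list would hold the remaining tokens, each mapped to the same placeholder.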
