"How to remove rows containing a particular string/word from an Excel file?"

soham0077 Member Posts: 1 Contributor I
edited June 2019 in Help
Hi, I want to delete the rows that contain a specific word from an Excel file and get the output without those rows. I am using RapidMiner version 5.0.13 and have only recently started using it. Can anyone suggest how to go about this and which operators to choose?
I have read about the "Filter Examples" operator. Given an Excel file in .xls format, what is the best way to get the output without the rows containing a particular word? Please reply.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    You have already imported the data via the Read Excel operator, right? Then just add a Filter Examples operator. In RapidMiner 5 you can then filter on one column: select attribute_value_filter as the condition_class. Then the parameter_string

    column1 != .*badWord.*

    will keep all rows where column1 does not contain the string "badWord".

    To match only whole words, your filter should look like this:

    column1 != ^(.+\s)*badWord(\s.*)*$

    The cryptic syntax used here is that of regular expressions :) Google for that term to get more information.

    Best regards,
    Marius
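
For anyone who wants to try the two patterns from the answer outside RapidMiner, here is a minimal Python sketch. It is not RapidMiner code. It assumes the filter has to match the whole cell value (which is why the first pattern is wrapped in `.*`), so it uses `re.fullmatch`. The `keep_rows` helper and the sample rows are made up for illustration.

```python
import re

# Patterns copied from the answer; "badWord" stands for the word to filter on.
CONTAINS = re.compile(r".*badWord.*")
WHOLE_WORD = re.compile(r"^(.+\s)*badWord(\s.*)*$")


def keep_rows(rows, column, pattern):
    """Hypothetical helper: keep rows whose `column` value does NOT match.

    This mimics `column != regex`, assuming the regex must match the
    entire value rather than just a part of it.
    """
    return [row for row in rows if pattern.fullmatch(str(row[column])) is None]


# Invented sample data standing in for the imported Excel sheet.
rows = [
    {"column1": "this has badWord inside"},
    {"column1": "badWordish but not a whole word"},
    {"column1": "perfectly clean"},
]

# Drops every row that contains "badWord" anywhere in column1.
kept_contains = keep_rows(rows, "column1", CONTAINS)

# Drops only rows where "badWord" appears as a whitespace-delimited word.
kept_whole = keep_rows(rows, "column1", WHOLE_WORD)

print([r["column1"] for r in kept_contains])
print([r["column1"] for r in kept_whole])
```

With the first pattern only the "perfectly clean" row survives. With the whole-word pattern the "badWordish" row is kept as well, because "badWord" is not followed by whitespace or the end of the value there. Both patterns are case-sensitive.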