I am trying to filter rows in a dataset that contain characters such as ', | and @. Which operator is best suited to this, and how would the regular expression be written?
Answers
In the following I generate examples with value1..value10 as attribute values throughout. The offending characters are then inserted into value1 and value2. Finally, a regex replacement finds those edited values and sets them to missing, and examples with missing values are discarded. If you play around with the regex parameter editor (new in V5), you'll soon see just how useful, but impenetrable, regex can be.
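For anyone who wants to see the regex logic outside RapidMiner, here is a rough Python sketch of the same three steps. It is not the RapidMiner process itself; the attribute name `att1`, the helper function names and the exact way the characters are inserted are all made up for illustration. The character class `['|@]` matches any single one of the three characters (inside square brackets the `|` is a literal, not an alternation).

```python
import re

# Matches any one of the offending characters: ' | @
OFFENDING = re.compile(r"['|@]")

def build_examples():
    # Hypothetical data: one attribute "att1" holding value1..value10.
    examples = [{"att1": f"value{i}"} for i in range(1, 11)]
    # Insert offending characters into value1 and value2.
    examples[0]["att1"] = "val'ue1"
    examples[1]["att1"] = "value|2@"
    return examples

def mark_missing(examples):
    # Set any value containing an offending character to missing (None).
    return [
        {k: (None if OFFENDING.search(v) else v) for k, v in ex.items()}
        for ex in examples
    ]

def drop_missing(examples):
    # Discard examples that have at least one missing value.
    return [ex for ex in examples if all(v is not None for v in ex.values())]

cleaned = drop_missing(mark_missing(build_examples()))
print([ex["att1"] for ex in cleaned])  # value3 through value10 remain
```

If the tool you use insists that the pattern match the whole value rather than just part of it, wrapping the class as `.*['|@].*` should behave the same way, though I have not checked that against every operator.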
I have some experience with the old RapidMiner versions and am still getting to grips with V5, but it's a marvelous tool.