What would be the best operator to use to filter/remove bad data for Sentiment Analysis?

fishmansf Member Posts: 1 Learner I
I have a data set of reviews for the Spotify app, but some of the entries are corrupted and show up as non-numerical, non-alphabetical, or stray character data, as seen below. What would be a good operator to remove this kind of data? Thank you!

Answers

  • MarcoBarradas Administrator, Employee, RapidMiner Certified Analyst, Member Posts: 272 Unicorn
    Hi @fishmansf

    You can use the Filter Examples operator: set the filter condition to a regular-expression match and define a regex that detects these non-word characters, so that the affected examples are removed from your example set.
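
    For readers who want to prototype the regex before putting it into the operator, the same idea can be sketched in plain Python. This is only an illustration, not RapidMiner code: the names `is_bad` and `remove_bad`, the sample reviews, and the rule itself (a review is "bad" when it is empty or contains no word characters at all) are assumptions, and the real corrupted rows may need a different pattern.

```python
import re

# Assumed rule: a review is "bad" if it is empty or consists only of
# non-word characters (punctuation, symbols, whitespace).
ONLY_NON_WORD = re.compile(r"\W*")


def is_bad(review: str) -> bool:
    """Return True when the whole review matches the non-word pattern."""
    return ONLY_NON_WORD.fullmatch(review) is not None


def remove_bad(reviews):
    """Keep only the reviews that contain at least one word character."""
    return [r for r in reviews if not is_bad(r)]


# Hypothetical sample data, not taken from the original data set.
sample = ["Great app, love the playlists!", "???###@@@", "", "Crashes a lot :("]
print(remove_bad(sample))
```

    Note that `\W` in Python is Unicode-aware, so garbled text made of accented letters would still count as word characters under this rule; a stricter pattern would be needed to catch that case.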
