Remove Numeric and other data from Text Mining

aks Member Posts: 3 Newbie
Hi, I am a new user of RP. I have imported a file for sentiment analysis. It is a financial file, and I want to remove the numeric characters ($, 0, 1, ..., 9) from the loaded file. Which operator should I use? Thanks in advance.

Answers

  • aks Member Posts: 3 Newbie
    RP (RapidMiner Platform)
  • kayman Member Posts: 662 Unicorn
    Use the Replace Tokens operator.

    If you click the edit icon and then the drop-down, you get a few preselections. Usually the punctuation character class (replaced with a space or similar) works fine in these cases; you may also want to add the number range 0-9 if needed.
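
For readers who want to see the idea outside the RapidMiner GUI, here is a minimal Python sketch of token-level replacement. It is only an analogy, not how the operator is implemented: the function name `replace_in_tokens` and the character class (ASCII punctuation plus 0-9) are assumptions made for illustration, and RapidMiner's own preselections may be defined differently.

```python
import re
import string

# Assumed character class: ASCII punctuation (which includes "$") plus digits 0-9.
PUNCT_OR_DIGIT = re.compile("[" + re.escape(string.punctuation) + "0-9]")

def replace_in_tokens(tokens, replacement=" "):
    """Replace matching characters inside each token, then drop tokens left empty."""
    cleaned = []
    for token in tokens:
        new_token = PUNCT_OR_DIGIT.sub(replacement, token).strip()
        if new_token:
            cleaned.append(new_token)
    return cleaned

tokens = ["revenue", "$1,200", "Q3", "up", "15%"]
print(replace_in_tokens(tokens))
```

Tokens made up entirely of digits and punctuation disappear, while mixed tokens such as "Q3" keep their letters.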



  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can also just use the Replace operator on the text before tokenizing: replace the regular expression [0-9]+ with an empty string in the attribute(s) in question.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
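
The regex-before-tokenizing approach can be sketched in Python as follows. This is a rough illustration under stated assumptions rather than anything RapidMiner runs internally: the helper name `strip_numbers` is hypothetical, and the optional `\$?` prefix is an addition to the suggested `[0-9]+` pattern so that the dollar sign from the original question is removed as well.

```python
import re

# The suggested [0-9]+ pattern, with an assumed optional "$" prefix added.
NUMBER_PATTERN = re.compile(r"\$?[0-9]+")

def strip_numbers(text):
    """Remove digit runs (and a leading "$"), then collapse leftover whitespace."""
    without_numbers = NUMBER_PATTERN.sub("", text)
    return " ".join(without_numbers.split())

text = "Revenue rose to $1200 in 2019 from $950"
print(strip_numbers(text))
```

Thousands separators and decimal points (as in "1,200.50") are not matched by this pattern, so punctuation would still need separate handling.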