Extracting Names with Named Entity Recognition (NER)

eng_nael Member Posts: 4 Contributor I
Hello everyone,
  How can we extract person names from unstructured text using RapidMiner?
  If it is possible, please tell me how.

Thank you, everyone.

Answers

  • batstache611 Member Posts: 45 Guru

    RapidMiner does not have such capabilities at the moment. I've tried its various information extraction operators on text, but they're very basic. GATE, OpenNLP, Stanford NLP, etc. are some tools you can use to achieve this. Also, if you're comfortable trying another analytics platform, KNIME has integrated some good NLP tools such as NE taggers, text annotators, and other cool operators/nodes.
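For a feel of what "tagging names in unstructured text" means before installing any of those tools, here is a minimal sketch in plain Python. It is only a naive heuristic (runs of two or more capitalized words), not real NER: the tools named above rely on trained models and dictionaries. The function name `candidate_names` and the sample sentence are made up for illustration.

```python
import re

# Assumption: a "name candidate" is two or more consecutive capitalized words,
# e.g. "Marie Curie". Real NER taggers do far more than this.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def candidate_names(text):
    """Return every run of two or more capitalized words found in `text`."""
    return [match.group() for match in NAME_PATTERN.finditer(text)]


sample = "The report says that Marie Curie met Albert Einstein in Brussels."
print(candidate_names(sample))
```

A heuristic like this will also flag multi-word places such as "New York" and any capitalized word that happens to start a sentence before a name, which is exactly why a trained tagger is the better choice for real data.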

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    NamSor is a RapidMiner extension that's able to determine gender, ethnicity, and origin from a name. Maybe that will help.

    https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_namsor

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    @batstache611 @Thomas_Ott  The Rosette text mining extension (third party but available from the marketplace) does have an operator for "extract entities", and it works with names as well as other entities.  You will need to set up a free account with them to test it.

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Yep, there's Rosette too. I haven't spent a lot of time with it but it looks really cool. 

  • batstache611 Member Posts: 45 Guru

    NamSor lets you extract gender and ethnicity information about a name record; it doesn't necessarily help you identify and tag entities (people, places, organizations, etc.) in a body of unstructured text.

  • batstache611 Member Posts: 45 Guru

    Thank you Brian,

     

    I have already tried the features of Rosette's API from within RapidMiner, and the results aren't very consistent. Entity extraction sometimes picks up garbage text as entities, sentiment analysis isn't any good at handling sarcasm or irony, etc. However, Rosette's biggest drawback is that it expects pre-processed input, i.e. the text has to be in cells in a data table; it cannot work with unstructured documents. I'm willing to accept that as well...

    But when it throws an error such as "Must contain meaningful text" even after I've brought the unstructured text data into a table format, defined the column types in the Data Editor, and told each Rosette operator (tokenize, sentence extract, sentiment, entity extract, names, etc.) which column in the data table contains the text, that's when I start losing my faith in RM's text analytics capabilities.
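One thing that might be worth ruling out before blaming the operators is whether some rows in the text column are empty or hold only punctuation. The sketch below is just a guess at a sanity check in plain Python, done outside RapidMiner; the thread does not confirm what triggers the error, and the helper names, the `text` column name, the three-letter threshold, and the sample rows are all hypothetical.

```python
import csv
import io


def has_letters(text, min_letters=3):
    """True if the cell holds at least `min_letters` alphabetic characters."""
    return sum(ch.isalpha() for ch in (text or "")) >= min_letters


def split_rows(rows, text_column):
    """Separate rows with usable text from rows that look empty or junk-only."""
    usable, suspect = [], []
    for row in rows:
        (usable if has_letters(row.get(text_column)) else suspect).append(row)
    return usable, suspect


# Hypothetical table: one document per row, text in the "text" column.
raw = "id,text\n1,John Smith joined Acme Corp in 2015.\n2,\n3,---\n"
rows = list(csv.DictReader(io.StringIO(raw)))
usable, suspect = split_rows(rows, "text")
print("usable ids:", [r["id"] for r in usable])
print("suspect ids:", [r["id"] for r in suspect])
```

If the suspect list is non-empty, filtering those rows out before the text reaches the extension is a cheap experiment; if it is empty, the cause lies elsewhere.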

     

    RapidMiner should really make an effort to integrate native NLP tools based on CoreNLP, GATE, OpenNLP, etc. that can do much more than the standard Text Processing extension can at the moment. I mean, with RM being a leader in Gartner's 2016 Magic Quadrant along with SAS and SPSS, one would naturally expect this of RM as it grows. Thank you very much.

  • AYLIEN Member Posts: 4 Contributor I

    Using the Entity Extraction and Concept Extraction features in the AYLIEN Text Analysis Extension you can extract names from unstructured text. 

     

    You can download the extension here

     

    You can get your free AYLIEN API key here and here is a quick guide on getting started

     

     
