RapidMiner

Extract Name Entity Recognition (NER)

Contributor II

Hello everyone,
How can we extract person names from unstructured text using RapidMiner?
If it is possible, please tell me how.

Thank you, everyone.
7 REPLIES
Super Contributor

Re: Extract Name Entity Recognition (NER)

RapidMiner does not have such capabilities at the moment. I've tried its various information extraction operators on text, but they're very basic. GATE, OpenNLP, Stanford NLP, etc. are some tools you can use to achieve this. Also, if you're comfortable trying another analytics platform, KNIME has integrated some good NLP tools such as NE taggers, text annotators, and other cool operators/nodes.

Moderator

Re: Extract Name Entity Recognition (NER)

NamSor is a RapidMiner extension that can determine gender, ethnicity, and origin from a name. Maybe that will help.

https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_namsor

Elite III

Re: Extract Name Entity Recognition (NER)

@batstache611 @Thomas_Ott The Rosette text mining extension (third party, but available from the marketplace) does have an "extract entities" operator, and it works with names as well as other entities. You will need to set up a free account with them to test it.

Brian T., Lindon Ventures - www.lindonventures.com
Analytics Consulting by Certified RapidMiner Analysts
Moderator

Re: Extract Name Entity Recognition (NER)

Yep, there's Rosette too. I haven't spent a lot of time with it but it looks really cool. 

Super Contributor

Re: Extract Name Entity Recognition (NER)

NamSor allows you to extract gender and ethnicity information from a name record. It doesn't necessarily help you identify and tag an entity (person, place, organization, etc.) in a body of unstructured text.

Super Contributor

Re: Extract Name Entity Recognition (NER)

Thank you Brian,

I have already tried the features of Rosette's API from within RapidMiner, and the results aren't very consistent. Entity extraction sometimes picks up garbage text as entities, sentiment analysis isn't any good at handling sarcasm or irony, etc. However, Rosette's biggest drawback is that it expects pre-processed input: the text has to be in cells in a data table, and it cannot work with unstructured documents. I could live with that too.

But when it throws an error such as "Must contain meaningful text" even after I've brought the unstructured text data into a table format, defined the column types in the Data Editor, and told each Rosette operator (tokenize, sentence extract, sentiment, entity extract, names, etc.) which column in the data table contains the text, that's when I start losing faith in RM's text analytics capabilities.
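
For anyone hitting the same table-format requirement, here is a minimal Python sketch of one way to turn unstructured documents into a one-row-per-document table before importing it into RapidMiner. It does not use Rosette's or RapidMiner's API. The function names `documents_to_rows` and `rows_to_csv`, the `id` and `text` column names, and the choice to drop rows with no letters or digits are all assumptions made for illustration. Whether empty or punctuation-only cells are what triggers the "Must contain meaningful text" error is not confirmed anywhere in this thread.

```python
import csv
import io
import re


def documents_to_rows(documents):
    """Turn raw documents into (id, text) rows, one document per row.

    Hypothetical helper: the id is the 1-based position in the input list.
    """
    rows = []
    for doc_id, raw in enumerate(documents, start=1):
        # Collapse whitespace runs (including newlines) so each
        # document fits cleanly into a single table cell.
        text = re.sub(r"\s+", " ", raw).strip()
        # Assumption: a cell with no letters or digits is not useful
        # input for a text operator, so skip it.
        if not re.search(r"\w", text):
            continue
        rows.append((doc_id, text))
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header, quoting every field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerow(["id", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


docs = [
    "Barack Obama met Angela Merkel\nin Berlin.",
    "   \n\t",
    "Tim Cook leads Apple.",
]
print(rows_to_csv(documents_to_rows(docs)))
```

The resulting CSV can be read with a standard CSV import, with the `text` column then selected as the input column for the text operators.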

 

RapidMiner should really make an effort to integrate native NLP tools based on CoreNLP, GATE, OpenNLP, etc. that can do much more than the standard Text Processing extension can at the moment. With RapidMiner being a leader in Gartner's 2016 Magic Quadrant along with SAS and SPSS, one would naturally expect this of RM as it grows. Thank you very much.

Contributor II

Re: Extract Name Entity Recognition (NER)

Using the Entity Extraction and Concept Extraction features in the AYLIEN Text Analysis Extension, you can extract names from unstructured text.

You can download the extension here.

You can get your free AYLIEN API key here, and here is a quick guide on getting started.