I am very new to text mining and have just started trying out RapidMiner.

I saw in the descriptions that it is capable of named entity recognition, but I couldn't find an example of this.

Is it possible to give RapidMiner text documents and get entity recognition out of it, for example to find out which 'computer domain expressions' are used in the documents? This would make it possible to know that a given document talks about 'electronic management of documents', 'automatic archiving', etc.

If so, would it mean that a corpus of computer domain terms has to be built first? Are there any available?

And finally, would it be possible to do the same on Chinese texts?

Thanks for any information.


    Your solution would be GATE (General Architecture for Text Engineering).
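To make the term-list idea from the question concrete, here is a minimal sketch of a dictionary (gazetteer-style) lookup in plain Python. It is not GATE's or RapidMiner's API; the term list, the Chinese entry, and the function name `find_domain_terms` are all hypothetical and only illustrate the approach under those assumptions.

```python
import re

# Hypothetical domain term list (a tiny "gazetteer"); a real one would be far larger.
DOMAIN_TERMS = [
    "electronic management of documents",
    "automatic archiving",
    "自动归档",  # assumed Chinese rendering of "automatic archiving"
]


def find_domain_terms(text, terms=DOMAIN_TERMS):
    """Return (term, start_offset) pairs for each occurrence, ordered by position."""
    hits = []
    for term in terms:
        # re.escape treats the term as literal text; matching is case-insensitive.
        for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            hits.append((term, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    sample = "Our product offers Automatic Archiving and electronic management of documents."
    for term, offset in find_domain_terms(sample):
        print(offset, term)
```

Because this matches substrings rather than whitespace-separated tokens, the same lookup also runs on Chinese text without a word segmenter, although a plain substring match can fire inside longer, unrelated words.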

