RapidMiner

Text mining

Contributor

Text mining

Hi,

I'm new to RapidMiner and I'm still finding my way around it.
What I want to do: I have about 300 PDF documents and one wordlist with about 100 different words. For each PDF document, I want to find the total number of occurrences of these words. I would also like to know the total number of words each PDF document contains.

Can somebody help me with modelling the process?

Thanks in advance.
3 REPLIES
Contributor

Re: Text mining

Hello,

Did you find a working process in the end? If so, would you please share it?
I have the same problem with many PDF documents.

Thank you
Moderator

Re: Text mining

Hi,
the operator to read PDF files is Read Document. You can combine it with Loop Files to read several files.
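
Once the text of each file has been read, the counting asked about in the original question is simple. As a minimal sketch of the logic outside RapidMiner, here is a plain Python version. It assumes the PDF text has already been extracted into strings; the function names, the file names, and the tokenization rule (runs of letters, case-insensitive) are illustrative assumptions, not RapidMiner operators or settings.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters; digits and punctuation split tokens.
TOKEN_PATTERN = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [token.lower() for token in TOKEN_PATTERN.findall(text)]


def count_words(text, wordlist):
    """Count the total words in one document and the hits for each wordlist entry."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    # Normalize the wordlist so duplicates and case variants are counted once.
    wanted = sorted({word.lower() for word in wordlist})
    per_word = {word: counts[word] for word in wanted}
    return {
        "total_words": len(tokens),
        "wordlist_hits": sum(per_word.values()),
        "per_word": per_word,
    }


def count_corpus(documents, wordlist):
    """Apply count_words to every document in a {name: text} mapping."""
    return {name: count_words(text, wordlist) for name, text in documents.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins for text extracted from two PDF files.
    docs = {
        "report_a.pdf": "Data mining finds patterns. Mining data is fun.",
        "report_b.pdf": "Text analysis of documents.",
    }
    for name, result in count_corpus(docs, ["data", "mining", "text"]).items():
        print(name, result["total_words"], result["wordlist_hits"], result["per_word"])
```

The result has one entry per document with the total word count and the per-word hits, which corresponds to one row per PDF in the table the question asks for.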

Best,
Martin
--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
Contributor

Re: Text mining