PDF to database


Hello everyone,
I am a new member and have studied the software, but I have a problem:
I need to read a lot of PDFs, delete their references sections, categorize them by year of publication, and then do text mining to find the most frequently repeated words.
How can I do that?
Thanks in advance for your guidance.
Answers
Next, you can use the replace operators with a regex to strip what you don't need (for example, the references sections), and then use the document to data operators for the mining part.
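
If it helps to see the same idea outside the operators, here is a rough Python sketch of the regex step and the word counting. It is only an illustration under some assumptions: the text has already been extracted from each PDF, the publication year is already known for each document, and the references section starts at a line containing only "References" or "Bibliography". The names strip_references, top_words_by_year and the small stopword list are made up for this example.

```python
import re
from collections import Counter, defaultdict

# Assumption: the references section begins at a line that contains only
# "References" or "Bibliography" (any letter case). Adjust for your PDFs.
REFERENCES_HEADING = re.compile(
    r"^[ \t]*(?:references|bibliography)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# Tiny illustrative stopword list; a real run would need a fuller one.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "from"}


def strip_references(text):
    """Return the text up to the last references heading, if one is found."""
    matches = list(REFERENCES_HEADING.finditer(text))
    if not matches:
        return text
    return text[: matches[-1].start()]


def top_words_by_year(docs, n=10):
    """docs is an iterable of (year, text) pairs; returns {year: [(word, count), ...]}."""
    counts = defaultdict(Counter)
    for year, text in docs:
        body = strip_references(text).lower()
        words = re.findall(r"[a-z]+", body)
        # Skip stopwords and very short tokens.
        counts[year].update(
            w for w in words if w not in STOPWORDS and len(w) > 2
        )
    return {year: counter.most_common(n) for year, counter in counts.items()}


if __name__ == "__main__":
    sample_docs = [
        (2019, "Text mining finds patterns. Mining text is useful.\nReferences\nSmith 2001"),
        (2020, "Clustering groups documents. Clustering scales.\nBibliography\nDoe 1999"),
    ]
    for year, words in sorted(top_words_by_year(sample_docs, n=3).items()):
        print(year, words)
```

The pattern only catches a heading that sits on its own line, so PDFs with numbered headings such as "7. References" would need a looser regex.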