pdf to database
r_esmaeilzadeh1
Member Posts: 1 Newbie
Hello everyone,
I am a new member and have been studying the software, but I have a problem:
I need to read a large number of PDFs, remove their references sections, group them by year of publication, and then run text mining to find the most frequent words.
How can I do that?
Thanks in advance for your guidance.
Answers
Next, you can use the Replace operators with a regex to strip what you don't need, and then use the Documents to Data operator for the mining part.
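For anyone who wants to see the logic outside of RapidMiner, here is a rough Python sketch of the same three steps: strip the references section with a regex, group by year, and count the most frequent words. It is not a RapidMiner process, only an illustration. It assumes the PDF text has already been extracted and that you know each document's year; the heading pattern, the stop-word list, and the function names are all made up for this example and will likely need adjusting for real papers.

```python
import re
from collections import Counter, defaultdict

# Assumption: the references section starts at a line that contains only
# "References" or "Bibliography". Real PDFs may need a looser pattern.
REFERENCES_HEADING = re.compile(
    r"^\s*(references|bibliography)\s*$", re.IGNORECASE | re.MULTILINE
)

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "are", "was"}


def strip_references(text):
    """Drop everything from the last references heading to the end of the text."""
    matches = list(REFERENCES_HEADING.finditer(text))
    return text[: matches[-1].start()] if matches else text


def top_words_by_year(documents, n=10):
    """documents: iterable of (year, text) pairs.

    Returns a dict mapping each year to its n most frequent words
    as (word, count) pairs.
    """
    counts = defaultdict(Counter)
    for year, text in documents:
        body = strip_references(text).lower()
        words = re.findall(r"[a-z]+", body)
        # Skip stop words and very short tokens.
        counts[year].update(
            w for w in words if len(w) > 2 and w not in STOP_WORDS
        )
    return {year: counter.most_common(n) for year, counter in counts.items()}


# Made-up sample data standing in for extracted PDF text.
docs = [
    (2019, "Text mining finds patterns in text.\nReferences\n[1] Smith, Mining data."),
    (2019, "Mining large text collections.\nReferences\n[2] Jones."),
    (2020, "Clustering documents by topic.\nBibliography\n[3] Lee, Clustering."),
]

result = top_words_by_year(docs, n=2)
print(result[2019])  # [('text', 3), ('mining', 2)]
```

Cutting at the last matching heading is a deliberate choice here, so that an earlier line that happens to read "References" does not truncate the body. The words inside the reference entries ("Smith", "Jones") never reach the counts.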