Text Mining: analyse PDFs with a dictionary which has categories
I want to analyse 35 PDFs against a dictionary. The output should be an Excel file showing how often each word of the dictionary appears in the PDFs. Note that the dictionary is not just a flat list of words: the words are classified into five categories. The analysis should therefore tell me how often each dictionary word is mentioned and which category is mentioned the most.
I have already read many questions here and watched tutorials, but I could not find exactly what I need. Trial and error has not worked so far either. I hope someone can help me.
Many thanks in advance,
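For reference, the counting logic described above can be sketched outside RapidMiner in a few lines of Python. This is only a minimal illustration under stated assumptions, not a RapidMiner process: the category names, dictionary words, file names and sample texts are all made up, the PDF text is assumed to have been extracted already (for example with a PDF library), and the result is written as CSV, which Excel can open, to keep the sketch free of third-party packages.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical dictionary: category -> words. The real one would have five categories.
DICTIONARY = {
    "environment": ["emission", "energy"],
    "governance": ["board", "audit"],
}


def count_terms(text, dictionary):
    """Return {(category, word): count} for the text of one document."""
    # Lowercase and split into alphabetic tokens; only exact single-word matches count.
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    return {
        (category, word): tokens[word.lower()]
        for category, words in dictionary.items()
        for word in words
    }


def build_rows(documents, dictionary):
    """Build one row per (document, category, word) with its frequency."""
    rows = []
    for name, text in documents.items():
        for (category, word), n in count_terms(text, dictionary).items():
            rows.append(
                {"document": name, "category": category, "word": word, "count": n}
            )
    return rows


def category_totals(rows):
    """Sum the word counts per category over all documents."""
    totals = Counter()
    for row in rows:
        totals[row["category"]] += row["count"]
    return totals


def to_csv(rows):
    """Serialise the rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["document", "category", "word", "count"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder texts standing in for the extracted contents of the PDFs.
documents = {
    "report_a.pdf": "Energy use fell. The board reviewed energy and emission targets.",
    "report_b.pdf": "The audit covered the board. Board members approved it.",
}

rows = build_rows(documents, DICTIONARY)
totals = category_totals(rows)
print(to_csv(rows))
print(totals.most_common())  # categories ordered from most to least mentioned
```

Because the sketch matches exact tokens only, a plural such as "emissions" would not be counted for "emission", and multi-word dictionary entries are not handled; stemming or phrase matching would have to be added for that.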