
How to check if some specific indicators are mentioned in a set of business reports?

TENA Member Posts: 1 Learner I

We are analysing some business annual reports (13 reports in PDF format). We are new to RapidMiner, but thanks to the training resources and the answers in the community we managed to run a cluster analysis of the parts of the annual reports we are interested in. For that analysis we used the Process Documents from Files operator to extract the words, which are then used by the clustering operator.
Now we are interested in a different analysis: instead of having RapidMiner extract the list of words from the reports, we already have a given word list and want to check whether each of these indicators (words) is mentioned in the business reports. However, I have not found any example showing how to build a process that produces this result. I would be very grateful if you could help with an example or an indication of which operators to use.
Thanks


Answers

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @TENA,

    I worked on a similar project some months ago.
    The process in the attached file extracts the sentence(s) of the report in which the keyword(s) appear(s).
    To run the process in the attached file, you will need to:
     - install Python on your computer
     - install the Python Scripting extension

    If this process does not fit your use case, please provide at least two representative PDF reports and a list
    of indicators (words).

    Regards,

    Lionel
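
For readers without the attached file, here is a minimal Python sketch of the general idea described above: checking which indicators from a given list are mentioned in each report and pulling out the sentences where they appear. It is not the attached process. It assumes the text of each PDF has already been extracted into a plain string (PDF extraction is not shown), and the function names, report names, and sample texts are hypothetical, chosen only for illustration. The sentence splitter is deliberately naive.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): a sentence ends at '.', '!' or '?'
    # followed by whitespace. Abbreviations will be split incorrectly.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_indicator_sentences(text, indicators):
    # Map each indicator to the sentences in which it occurs
    # (case-insensitive, whole-word match).
    sentences = split_sentences(text)
    hits = {}
    for indicator in indicators:
        pattern = re.compile(r"\b" + re.escape(indicator) + r"\b", re.IGNORECASE)
        hits[indicator] = [s for s in sentences if pattern.search(s)]
    return hits


def presence_table(reports, indicators):
    # For each report, record True/False per indicator.
    table = {}
    for name, text in reports.items():
        hits = find_indicator_sentences(text, indicators)
        table[name] = {ind: bool(found) for ind, found in hits.items()}
    return table


# Hypothetical sample data standing in for text extracted from the PDFs.
reports = {
    "report_a": "Revenue grew this year. Our carbon emissions fell. Staff turnover was stable.",
    "report_b": "The board met twice. Revenue was flat.",
}
indicators = ["revenue", "carbon emissions", "turnover"]

table = presence_table(reports, indicators)
for name, row in table.items():
    print(name, row)
```

In RapidMiner, a script along these lines could be run through the Python Scripting extension mentioned above. The whole-word, case-insensitive matching is a design choice of this sketch; indicators that appear in inflected forms (for example plurals) would need stemming or looser patterns.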
     