Use pdf file name as attribute

vroba Member Posts: 5 Newbie
Hello everyone,

I want to do some simple text mining on PDF files in RapidMiner (RM), but I'm a little stuck right now.

I created a process that uses the Loop Files and Process Documents operators to read in several PDF files.
Since I have many files to analyze and compare, I would like to create an attribute that contains the file name so I can keep track of which document is which.

I enabled macros and tried to include the file name by generating a new attribute.
The problem is that the generated attribute contains the file name of the last file that was read for every document, not the name of the corresponding document. How can I make sure that each document gets its own file name as the attribute value?
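
The symptom described here, every row carrying the name of the last file, is typical of a value that is read once after a loop has finished rather than once per iteration. Whether that is the cause in this particular process cannot be told from the post; the following Python sketch only illustrates the general difference. The function names and file paths are hypothetical and are not part of RapidMiner.

```python
from pathlib import Path


def tag_inside_loop(paths):
    """Attach the file name while each file is being handled."""
    rows = []
    for path in paths:
        # The name is taken during this iteration, so it matches this file.
        rows.append({"source": str(path), "file_name": Path(path).name})
    return rows


def tag_after_loop(paths):
    """Attach the file name only after the loop has finished."""
    rows = []
    current_name = None  # plays the role of a loop variable that is overwritten
    for path in paths:
        current_name = Path(path).name
        rows.append({"source": str(path)})
    # By now current_name holds only the last file's name,
    # so every row receives the same value.
    for row in rows:
        row["file_name"] = current_name
    return rows


# Hypothetical example paths; the files do not need to exist.
paths = ["docs/a.pdf", "docs/b.pdf", "docs/c.pdf"]
per_file = [row["file_name"] for row in tag_inside_loop(paths)]
last_only = [row["file_name"] for row in tag_after_loop(paths)]
```

In the first function each row keeps the name of its own file; in the second, all rows end up with the name of the last file in the list.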

Alternatively, is there a way to include metadata_file directly as an attribute?

I have attached my process and the first 5 files I want to read.
I would really appreciate any help. Thank you in advance!

Best Answer

  • jwpfau Employee, Member Posts: 270 RM Engineering
    Solution Accepted
    Hi Veronika,

    Yes, you can select Process → Synchronize Meta Data with Real Data.

    You then have to run the process once to populate the meta data.

    Greetings,
    Jonas

Answers

  • jwpfau Employee, Member Posts: 270 RM Engineering
    Hi,

    Couldn't you remove the surplus metadata attributes with the Select Attributes operator, using these settings?

    type: exclude attributes
    attribute filter type: subset
    select subset: the metadata fields that you don't need
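
The settings above amount to "keep every attribute except a chosen subset". As a rough illustration of that idea outside RapidMiner, here is a minimal Python sketch. The helper `exclude_attributes` is hypothetical, and the attribute names `metadata_path` and `metadata_date` are assumed for the example; only `metadata_file` is mentioned in this thread.

```python
def exclude_attributes(rows, unwanted):
    """Return copies of the rows without the attributes named in `unwanted`."""
    unwanted = set(unwanted)
    return [
        {name: value for name, value in row.items() if name not in unwanted}
        for row in rows
    ]


# Hypothetical example set: one row per document.
rows = [
    {"metadata_file": "a.pdf", "metadata_path": "/data/a.pdf",
     "metadata_date": "2020-01-01", "text": "first document"},
    {"metadata_file": "b.pdf", "metadata_path": "/data/b.pdf",
     "metadata_date": "2020-01-02", "text": "second document"},
]

# Drop the metadata fields that are not needed and keep the file name.
filtered = exclude_attributes(rows, ["metadata_path", "metadata_date"])
```

The original rows are left unchanged; `filtered` holds only `metadata_file` and `text` for each document.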

    Greetings,
    Jonas
  • vroba Member Posts: 5 Newbie
    Hi Jonas, 

    thank you for your answer! 

    I'm not sure exactly what you mean, because the metadata attributes don't show up in the Select Attributes operator.
    Is there a way to turn metadata into "real" data?

    Greetings
    Veronika 
  • vroba Member Posts: 5 Newbie
    Hi Jonas, 

    thank you very much, now it works! 

    Greetings 
    Veronika 