Use pdf file name as attribute

vroba Member Posts: 5 Learner I
Hello everyone :smile:

I want to do some simple text mining on PDF files in RapidMiner, but I'm a little stuck right now.

I created a process that uses the Loop Files and Process Documents operators to read in several PDF files.
As I have a lot of files to analyze and compare, I would like to create an attribute that holds the file name, to keep track of everything.

I enabled macros and tried to include the file name by generating a new attribute.
The problem is that the generated attribute contains only the name of the last file I read in, not the name of the corresponding document. How can I make sure that each attribute value is the file name of its own document?

Or is there a way to just include metadata_file as an attribute?

I have attached my process and the first 5 files I want to read.
I would really appreciate any help. Thank you in advance!
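
For readers who think in code, the outcome asked for above (one row per document, each carrying its own file name) can be sketched in Python. This is not RapidMiner code: the function name `tag_documents` and the `size_bytes` field are hypothetical, and no PDF text is extracted. It only illustrates that the file name has to be recorded inside the loop, once per file.

```python
from pathlib import Path


def tag_documents(folder, pattern="*.pdf"):
    """Return one record per matching file, each carrying its own file name.

    Hypothetical helper for illustration; it does not parse PDF content.
    """
    records = []
    for path in sorted(Path(folder).glob(pattern)):
        # The name is captured per iteration, so every record keeps the
        # name of the file it came from rather than the last one seen.
        records.append({
            "file_name": path.name,
            "size_bytes": path.stat().st_size,
        })
    return records
```

Calling `tag_documents("my_pdfs")` on a folder of PDFs would give a list with one dictionary per file, which can then be compared or joined by `file_name`.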

Best Answer

  • jwpfau Employee, Member Posts: 303 RM Engineering
    Solution Accepted
    Hi Veronika,

    yes, you can select Process → Synchronize Meta Data with Real Data.

    You then have to run the process once to populate the meta data.

    Greetings,
    Jonas

Answers

  • jwpfau Employee, Member Posts: 303 RM Engineering
    Hi,

    couldn't you throw out the surplus metadata attributes with Select Attributes, using these settings?

    type: exclude attributes
    attribute filter type: subset
    select subset: the metadata fields that you don't need

    Greetings,
    Jonas
  • vroba Member Posts: 5 Learner I
    Hi Jonas, 

    thank you for your answer! 

    I'm not sure what exactly you mean, because the metadata attributes don't show up in the Select Attributes operator.
    Is there a way to turn metadata into "real" data?

    Greetings
    Veronika 
  • vroba Member Posts: 5 Learner I
    Hi Jonas, 

    thank you very much, now it works! 

    Greetings 
    Veronika 