"extract data from a table in a pdf-file"

currant Member Posts: 14 Contributor II
edited May 2019 in Help
Hi All,

Is it possible to extract the data from a table in a PDF file with RapidMiner (RM)? If so, how? Does anyone have experience with this? Do I need XPath experience?

Thanks in advance!



    MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Sorry, RapidMiner cannot extract tables from PDF files directly. However, the Text Processing extension provides operators that read PDF files as plain text. To download and install it, select "Update RapidMiner" from the Help menu.
    Depending on the content and formatting of the tables, you may be able to copy them by hand from the PDF file into an Excel sheet and then load that sheet with the Read Excel operator.

    Cheers, Marius
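
If the table survives as plain text (for example after reading the PDF as text, or after copy-pasting it into a text file), a small script outside RapidMiner might be enough to turn it into rows and columns. The following is only a rough sketch in Python, not a RapidMiner feature. It assumes that every table row sits on its own line and that columns are separated by two or more spaces; the function names `parse_table_text` and `rows_to_csv` and the sample data are made up for illustration.

```python
import csv
import io
import re


def parse_table_text(text):
    """Split plain-text table lines into rows of cell strings.

    Assumption: columns are separated by runs of two or more
    whitespace characters, so single spaces inside a cell survive.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        rows.append(re.split(r"\s{2,}", line))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text (one line per row)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical text as it might look after copying a table out of a PDF.
sample = """
Name        Quantity   Price
Red apple   3          1.20
Pear        10         0.85
"""

rows = parse_table_text(sample)
print(rows_to_csv(rows))
```

Real PDF text is often messier than this (wrapped cells, columns separated by a single space, page headers mixed in), so the splitting rule would probably need adjusting per document. The resulting CSV text could then be saved to a file and loaded in much the same way as the Excel sheet mentioned above.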