PDF Table extraction into data
Extract PDF Tables seems to be a relatively new extension, and I do not see much discussion of it.
I have multiple PDF documents from which I need to extract the data contained in the tables. The output of this operator is an IO Object collection. Because there are tables within tables, the example sets in the collection do not have a uniform structure, so I am unable to use the Append operator.
I am also stumped as to how to convert this collection into data so that I can use the other available operators to clean it.
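To make the mismatch concrete, here is a minimal Python sketch of what I am effectively trying to do. This is not RapidMiner code: the two tables and the helper name `append_with_union` are made up for illustration, and each table is just a list of rows. The idea is to take the union of all column names and pad whatever is missing, so that tables with different columns can still be stacked.

```python
def append_with_union(tables):
    """Stack tables whose columns differ, padding missing columns with None."""
    # Collect every column name, preserving first-seen order.
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # Rebuild every row against the full column list.
    rows = [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]
    return columns, rows


# Hypothetical tables standing in for two example sets from the collection.
invoice_lines = [{"item": "widget", "qty": 2}]
totals = [{"item": "total", "amount": 19.5}]

columns, rows = append_with_union([invoice_lines, totals])
```

I assume something equivalent is needed inside the process before the example sets can be appended, but I do not know which operators would do it.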
What is the best practice for using this operator?
@sgenzer do I get a free shirt?