PerlMonks  

Re^2: Read table data from PDF

by ateague (Monk)
on May 11, 2016 at 13:59 UTC ( [id://1162760]=note )


in reply to Re: Read table data from PDF
in thread Read table data from PDF

I have had good experiences using an external pdf2txt converter and then parsing the output, but this of course depends on your input document.

As a side note, if you go down this route, make absolutely certain that your external program will extract the text with some sort of X/Y position.

Unless you have full and complete control over the PDF and its generation, parsing PDF text by fixed-position rows and columns is pretty much guaranteed to end in failure, frustration, and an absolutely massive nest of exceptions and special parsing cases.
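
To make the X/Y point concrete, here is a rough sketch (in Python, purely for illustration) of rebuilding table rows from positioned text instead of relying on fixed columns. It assumes your extractor can give you each piece of text as an `(x, y, text)` tuple with `y` growing downward; that tuple format, the `group_into_rows` name, and the `y_tolerance` value are all made up for this example and are not the output of any particular tool.

```python
from typing import List, Tuple

# Hypothetical fragment format: (x, y, text), with y increasing down the page.
Fragment = Tuple[float, float, str]


def group_into_rows(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[List[str]]:
    """Cluster fragments into rows by y position, then order each row by x."""
    rows: List[List[Fragment]] = []
    # Walk the fragments from top to bottom.
    for frag in sorted(fragments, key=lambda f: f[1]):
        # Same row if this fragment's y is close to the y of the row's first fragment.
        if rows and abs(frag[1] - rows[-1][0][1]) <= y_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    # Within each row, read left to right and keep only the text.
    return [[text for _, _, text in sorted(row, key=lambda f: f[0])] for row in rows]


if __name__ == "__main__":
    # Fragments deliberately out of order and slightly misaligned.
    sample = [
        (120.0, 50.4, "Qty"), (20.0, 50.0, "Item"),
        (20.0, 65.1, "Widget"), (121.5, 64.8, "3"),
        (119.0, 80.0, "12"), (20.0, 80.3, "Gadget"),
    ]
    for row in group_into_rows(sample):
        print(row)
```

The tolerance is the knob you will end up tuning per document: too small and one visual row splits into several, too large and neighbouring rows merge. Real tables with wrapped cells or merged columns will need more than this, but it at least survives small shifts in layout that break fixed-offset parsing.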
