Lawliet has asked for the wisdom of the Perl Monks concerning the following question:
To put it bluntly, I need to extract data from a PDF file.
More specifically, inside this two-page PDF file lies a multi-row table with two or three columns (the count changes). The table is oddly formatted (you would have to see the document to understand what I mean, I guess), but I believe I can parse it given the right module. The only one I see that may help is CAM::PDF. Do you know of anything more helpful for parsing PDF tables? Should I convert it to a separate file format and go from there?
Update: I decided to just convert it to an HTML document (thanks, Popcorn Dave). Thanks to all who helped. I am still willing to listen to any further suggestions if you have them, though.
I'm so adjective, I verb nouns!
chomp; # nom nom nom
Replies are listed 'Best First'.
Re: Extracting information from a PDF file
by Perlbotics (Archbishop) on Aug 20, 2008 at 21:59 UTC
by Lawliet (Curate) on Aug 20, 2008 at 22:08 UTC
by Perlbotics (Archbishop) on Aug 20, 2008 at 22:46 UTC
by Lawliet (Curate) on Aug 20, 2008 at 22:54 UTC
Re: Extracting information from a PDF file
by Popcorn Dave (Abbot) on Aug 20, 2008 at 20:31 UTC
by Your Mother (Archbishop) on Aug 20, 2008 at 22:30 UTC
by Lawliet (Curate) on Aug 20, 2008 at 20:35 UTC
by Popcorn Dave (Abbot) on Aug 20, 2008 at 22:26 UTC
by Lawliet (Curate) on Aug 20, 2008 at 22:29 UTC