in reply to How to Extract PDF tables using Perl
Hi,
the solution I've seen is to use:
$doc->getPageContent($pagenum);
instead of:
$doc->getPageText($pagenum);
But even if the solution sounds simple, there is still work for you to do.
You will have to parse the return value of getPageContent.
Here is a possible example of page content:
9.9213 0 Td Content Tj
The two numbers before the Td operator tell you the position of the content: they are the x and y offsets of the text that follows.
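To give an idea of what that parsing could look like, here is a minimal sketch. It works on a plain string, so you can try it without a PDF. The helper name extract_positioned_text is made up for this example, and the sketch assumes that strings appear in parentheses before Tj, as they normally do in a raw content stream. It only handles BT, Td and Tj, and it adds up the Td offsets because Td moves relative to the start of the current line. Operators such as Tm, TJ or cm are ignored, so treat the coordinates as a rough guide rather than exact page positions.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns a list of
# [x, y, text] array refs found in a raw content stream string.
sub extract_positioned_text {
    my ($content) = @_;
    my $num = qr/[-+]?(?:\d+\.?\d*|\.\d+)/;    # a PDF number
    my ($x, $y) = (0, 0);
    my @items;

    while ($content =~ /\G\s*(?:(BT)\b|($num)\s+($num)\s+Td\b|\(((?:[^()\\]|\\.)*)\)\s*Tj\b|\S+)/gc) {
        if (defined $1) {
            ($x, $y) = (0, 0);                 # a new text object starts at the origin
        }
        elsif (defined $2) {
            $x += $2;                          # Td offsets are relative,
            $y += $3;                          # so accumulate them
        }
        elsif (defined $4) {
            push @items, [$x, $y, $4];         # text shown at the current position
        }
        # any other token is consumed by \S+ and ignored
    }
    return @items;
}

# Sample stream made up for this example; in practice the string would
# come from $doc->getPageContent($pagenum).
my $sample = "BT /F1 12 Tf 9.9213 0 Td (Name) Tj 100 0 Td (Price) Tj ET";
for my $item (extract_positioned_text($sample)) {
    printf "x=%.2f y=%.2f text=%s\n", @$item;
}
```

For a table, you could then group the items by their y value to get rows and sort each group by x to get the columns.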
UPDATE: This gives you a HashRef of your page:
$doc->getPageContentTree($pagenum)