Re: search a pdf file

by Samy_rio (Vicar)
on Jun 27, 2007 at 08:49 UTC (#623555=note)

in reply to search a pdf file

Hi karana, I have tried the CAM::PDF module, and I think it will help you.

    use strict;
    use warnings;
    use CAM::PDF;
    use CAM::PDF::PageText;

    my $file   = $ARGV[0];
    my $search = $ARGV[1];

    my $doc   = CAM::PDF->new($file) || die "$CAM::PDF::errstr\n";
    my $pages = $doc->numPages();

    for my $pg (1..$pages) {
        my $foo = $doc->getPageText($pg);
        next unless defined $foo;    # no text extracted from this page
        # Capture the number that follows the search term; skip pages without a match
        my ($data) = $foo =~ m/$search\s*(\d+)/si or next;
        print "In $pg page: $search Value is $data\n";
    }

    __END__

    Output is:
    In 1 page: def Value is 20
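The matching step above treats the search term as a regular expression and keeps only the first number per page. If the term should be taken literally and every occurrence is wanted, something like the following might work. This is only a sketch: find_values is a hypothetical helper (not part of CAM::PDF), and the sample string stands in for whatever getPageText returns for a page.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CAM::PDF): return every number that
# follows $search in $text, treating $search as literal text.
sub find_values {
    my ($text, $search) = @_;
    return () unless defined $text;    # guard in case a page yields no text
    # \Q...\E escapes regex metacharacters in the search term;
    # /g in list context returns all captures, /i ignores case.
    return $text =~ m/\Q$search\E\s*(\d+)/gi;
}

# Stand-in for the string that getPageText would return for one page.
my $sample = "abc 10\ndef 20\nDEF   35\n";
my @values = find_values($sample, 'def');
print "def values: @values\n";    # prints "def values: 20 35"
```

Inside the page loop, the same helper could be called as find_values($doc->getPageText($pg), $search), printing one line per value found.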

Velusamy R.

eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Re^2: search a pdf file
by dpavlin (Friar) on Jun 27, 2007 at 15:09 UTC
    I must add another vote for CAM::PDF. My problem was parsing orders that arrived as PDF files, and this module (with a few lines of code, just like the example above) made that hard task easy.

    I examined all the other PDF modules on CPAN and concluded that there is some great code if you want to remix PDFs, but for extracting content, CAM::PDF is the clear winner for me.

