PerlMonks  

Find blank pages in PDF

by Samy_rio (Vicar)
on Jul 04, 2006 at 12:35 UTC ( [id://559153]=perlquestion )

Samy_rio has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I need to find the blank pages in a PDF. I did a Super Search, but I didn't get any links about this.

I tried CAM::PDF. In the given PDF file a page can be blank and still contain the header information that appears on every page. That is,

03371 _ FM _ i -xv i .qx d 6/28/06 7 : 31 PM P age i

The following code does not display the blank pages.

use CAM::PDF;

my $doc = CAM::PDF->new($ARGV[0]) || die "$CAM::PDF::errstr\n";
my $pages = $doc->numPages();
print $pages;
for (1..$pages) {
    print $_ if ($doc->getPageText($_) eq '');
}

Please suggest how to find the blank pages in a PDF.

Regards,
Velusamy R.


eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Replies are listed 'Best First'.
Re: Find blank pages in PDF
by marto (Cardinal) on Jul 04, 2006 at 13:04 UTC
    Samy_rio,

    Check out CAM::PDF::PageText ("Turn a page content tree into a string"), which may be what you need to determine whether the page has anything on it. Sadly I cannot test this for you at the moment because I am at work :(

    Hope this helps

    Martin
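
A minimal, untested-against-your-file sketch of marto's suggestion: fetch each page's content tree with getPageContentTree() and render it to a string with CAM::PDF::PageText->render(). The helper name has_no_text is made up for illustration, and the sketch assumes that "no visible characters" is a good enough test. It will still treat a page that carries only a running header as non-blank.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the rendered page text is undef
# or contains nothing but whitespace.
sub has_no_text {
    my ($text) = @_;
    return (defined $text && $text =~ /\S/) ? 0 : 1;
}

# CAM::PDF is only loaded when a file name is given on the command line.
if (@ARGV) {
    require CAM::PDF;
    require CAM::PDF::PageText;
    my $doc = CAM::PDF->new($ARGV[0]) or die "Cannot open $ARGV[0]\n";
    for my $page (1 .. $doc->numPages()) {
        # Render the page content tree to a plain string.
        my $tree = $doc->getPageContentTree($page);
        my $text = $tree ? CAM::PDF::PageText->render($tree) : undef;
        print "Page $page has no text\n" if has_no_text($text);
    }
}
```

Run it as `perl script.pl file.pdf`; without an argument it does nothing.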
Re: Find blank pages in PDF
by starbolin (Hermit) on Jul 07, 2006 at 04:35 UTC

    Your code assumes getPageText() returns an empty string when there are no text blocks on the page. This is probably an incorrect assumption. In general, a function could be returning a false value (0 or ''), undef, or a string containing only whitespace (tab, CR, etc.). Try this:

    {
        my $foo = $doc->getPageText($_);
        print $_ unless (defined $foo &&              # Returned something and,
                         $foo =~ m/[[:alnum:]]+/ms ); # actually returned text
    }

    Sorry, I didn't actually test this.

    update: fixed that dratted ~=/=~ update: fixed regex, tested now.
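
Since the original question says the otherwise blank pages still carry a running header, an alphanumeric test alone would see that header as text. Here is a rough sketch of one way around that: ignore any line that matches the header before deciding. The names is_blank_page and $HEADER_RE are invented for this example, the pattern is only a guess modelled on the single header line quoted in the question, and it assumes the header comes back from getPageText() on a line of its own.

```perl
use strict;
use warnings;

# Hypothetical pattern for the running header quoted in the question.
# It is matched against a line with all whitespace removed, because the
# extracted text has stray spaces scattered through it.
my $HEADER_RE = qr{\d+_\w+_[\w-]+\.qxd\d{1,2}/\d{1,2}/\d{2}\d{1,2}:\d{2}[AP]MPage\w+};

# True if the text is undef, or every line is empty or only the header.
sub is_blank_page {
    my ($text, $header_re) = @_;
    return 1 unless defined $text;
    for my $line (split /\n/, $text) {
        (my $squeezed = $line) =~ s/\s+//g;   # drop all whitespace
        next if $squeezed eq '';
        next if defined $header_re && $squeezed =~ /\A$header_re\z/;
        return 0;                             # found real content
    }
    return 1;
}

# CAM::PDF is only loaded when a file name is given on the command line.
if (@ARGV) {
    require CAM::PDF;
    my $doc = CAM::PDF->new($ARGV[0]) or die "Cannot open $ARGV[0]\n";
    for my $page (1 .. $doc->numPages()) {
        print "Page $page looks blank\n"
            if is_blank_page($doc->getPageText($page), $HEADER_RE);
    }
}
```

The header pattern would need adjusting to whatever your own PDF actually prints; dumping getPageText() for a known blank page is probably the quickest way to see it.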

    s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}

Node Type: perlquestion [id://559153]
Approved by prasadbabu
Front-paged by prasadbabu