PerlMonks  

Re^2: Capturing errors from 3-arg pipe open in ActivePerl 5.020

by ateague (Monk)
on Nov 16, 2015 at 18:30 UTC


in reply to Re: Capturing errors from 3-arg pipe open in ActivePerl 5.020
in thread [SOLVED] Capturing errors from 3-arg pipe open in ActivePerl 5.020

Thank you. That certainly makes sense. I feel I should back up and clarify the root problem, though.

I am working with a collection of PDF files and am using the pdftohtml.exe program to convert the PDF into an XML stream in order to extract text of interest with XML::Twig:

open (my $XML, "-|", "e:\\path\\to\\pdftohtml.exe -xml -zoom 1.4 -stdout $PDF_FILE")
    or die "pdftohtml failed:\n$!\n$^E";

my $t = XML::Twig->new(
    twig_handlers => {
        '/pdf2html/pagetext[(@top >= 180 and @top <= 190) and (@left >= 100 and @left <= 111)]' => \&RouteTo,
        '/pdf2html/pagetext[(@top >= 215 and @top <= 225) and (@left >= 260 and @left <= 270)]' => \&InvoiceSort,
        '/pdf2html/page' => sub { $_[0]->purge; 1; }, # free memory after every page
    },
    comments   => 'drop',   # remove any comments
    empty_tags => 'normal', # empty tags = <tag/>
);
$t->parse($XML);
close $XML;

The problem is that if I fat-finger the open command (e.g. type "-zom" instead of "-zoom" in the command arguments), or if "$PDF_FILE" cannot be found, the program merrily continues on its way, unaware that $XML is undefined. I've been working around this by wrapping the "$t->parse" call in an eval block, but I was wondering if there is a better way.

Replies are listed 'Best First'.
Re^3: Capturing errors from 3-arg pipe open in ActivePerl 5.020
by BrowserUk (Patriarch) on Nov 16, 2015 at 20:10 UTC
    1. The problem is that if I fat-finger the open command (e.g. type "-zom" instead of "-zoom" in the command arguments)

      You ought to detect that kind of error the first time you test your script; so correct the typo.

    2. or if "$PDF_FILE" could not be found, the program merrily continues on its way, unaware that $XML is undefined.

      This kind of depends on what the executable does in that situation. I'll assume it does the sensible thing: outputs an error message, then exits with a nonzero exit code.

      Normally, if you were reading the pipe yourself, your first attempt to read would get an end of file (with a pipe abandoned status), and you could then call waitpid on the pid returned by the open and check $? to obtain the exit code and status.

      As you are passing the filehandle into a module, the simplest check would be to call eof on the filehandle before you give it to XML::Twig; and if there's nothing to read, don't pass it on; just waitpid and check $?

    It can get more complicated if the executable is one of those that tries to be 'helpful' and hangs around rather than just exiting on error; but let's assume it's not :)
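
A minimal sketch of the eof-then-check-$? approach described above, assuming the child writes nothing to STDOUT when it fails. The helper name read_pipe_checked is hypothetical. It uses close in place of an explicit waitpid, since closing a piped handle waits for the child and sets $?.

```perl
use strict;
use warnings;

# Hypothetical helper: run $cmd through a read pipe and return its output.
# Dies if the pipe cannot be opened, or if the child produced no output.
sub read_pipe_checked {
    my ($cmd) = @_;

    # open() returns the child's pid; it can succeed even when the
    # command itself is bad, because a shell may be what gets started.
    my $pid = open(my $fh, '-|', $cmd)
        or die "Cannot start '$cmd': $!\n";

    if (eof $fh) {
        # Nothing to read: reap the child and report its exit code.
        close $fh;                 # waits for the child and sets $?
        my $exit = $? >> 8;
        die "'$cmd' produced no output (exit code $exit)\n";
    }

    my @lines = <$fh>;             # or hand $fh to a parser instead
    close $fh;
    return @lines;
}
```

In the OP's situation the handle would be passed to the parser rather than slurped into a list; the eof test before that point is the part that matters.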


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
      As you are passing the filehandle into a module, the simplest check would be to call eof on the filehandle before you give it to XML::Twig;

      That certainly did the trick, thank you very much!

      pipe.pl
      #!/usr/bin/perl
      use 5.020;
      use strict;
      use warnings;

      open (my $ARTICLE, "-|", "caesar");
      eof $ARTICLE and die "Can't start caesar:\n$!\n$^E";
      my $read = <$ARTICLE>;
      say "[$read]";
      Results:
      perl pipe.pl
      'caesar' is not recognized as an internal or external command,
      operable program or batch file.
      Can't start caesar:
      Inappropriate I/O control operation
      The handle is invalid at pipe.pl line 7.
Re^3: Capturing errors from 3-arg pipe open in ActivePerl 5.020
by NetWallah (Canon) on Nov 16, 2015 at 19:51 UTC
    ...eval block to catch this, but I was wondering if there was a better way.
    From the XML::Twig docs:

    safe_parse ( SOURCE [, OPT => OPT_VALUE ...])

    This method is similar to parse except that it wraps the parsing in an eval block. It returns the twig on success and 0 on failure (the twig object also contains the parsed twig). $@ contains the error message on failure.

    Note that the parsing still stops as soon as an error is detected, there is no way to keep going after an error.

            “The sources of quotes found on the internet are not always reliable.” — Abraham Lincoln.3; cf.

      Thank you for that tip.

Re^3: Capturing errors from 3-arg pipe open in ActivePerl 5.020
by dasgar (Priest) on Nov 16, 2015 at 19:51 UTC

    If I'm understanding correctly, you're basically wanting to call another program from your Perl code and capture the STDOUT and STDERR of that program so that your code can determine if the program ran successfully or encountered errors. Is that correct?

    If the description above is correct, then my approach would be to leverage Capture::Tiny instead of using the piped open construct.
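
Capture::Tiny is a CPAN module. As a rough stand-in that needs only core modules, the same idea (collect the child's STDOUT, STDERR and exit code) can be sketched with IPC::Open3. This is not Capture::Tiny's API, and run_and_capture is a hypothetical helper name. The sketch assumes the child's output is small, because it reads STDOUT to the end before touching STDERR and a child that fills the STDERR pipe first would block.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);
use Symbol qw(gensym);

# Hypothetical helper: run $cmd and return (stdout, stderr, exit code).
sub run_and_capture {
    my ($cmd) = @_;

    my $err = gensym;    # open3 will not autovivify the STDERR handle
    my $pid = open3(my $in, my $out, $err, $cmd);
    close $in;           # the child gets EOF on its STDIN

    # Assumption: small outputs only (see the note above on blocking).
    my $stdout = join '', <$out>;
    my $stderr = join '', <$err>;

    waitpid($pid, 0);    # reap the child; sets $?
    return ($stdout, $stderr, $? >> 8);
}
```

A caller could then treat a nonzero exit code or a non-empty STDERR as a failure before parsing anything.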

      If I'm understanding correctly, you're basically wanting to call another program from your Perl code and capture the STDOUT and STDERR of that program so that your code can determine if the program ran successfully or encountered errors. Is that correct?

      No, not quite. I am wondering why the 3-arg pipe open example provided by Perldoc does not work as expected.

      (N.B. I have updated the OP to clarify this)
