I'm very much enjoying reading Perl Best Practices. I find myself scribbling notes after almost every page; mainly concrete ideas for improving the quality and especially the maintainability of production Perl code at work. To me, this is the most important Perl book to be published for years, because it helps me sell Perl as a maintainable language to management.

However, the In-situ Arguments ("Allow the same filename to be specified for both input and output") practice described on page 304 in chapter 14 has me scratching my head, for it seems to me to be more a "dangerous practice" than a "best practice".

Here is a test program, derived from the example given in the book:

    # The idea is to use the Unix unlink trick to write the
    # destination file without clobbering the source file
    # (in the case where the source and destination are the same file).
    use strict;
    use warnings;

    my $source_file      = 'fred.tmp';
    my $destination_file = $source_file;

    # Open both filehandles...
    use Fatal qw( open );
    open my $src,  '<', $source_file;
    unlink $destination_file;
    open my $dest, '>', $destination_file;

    # Read, process, and output data, line-by-line...
    while (my $line = <$src>) {
        print {$dest} transform($line);
    }

    # This is my test version of the transform() function;
    # the sleep is there for convenience in testing what happens
    # if you interrupt proceedings mid stream by pressing CTRL-C.
    sub transform {
        sleep 1;
        return "hello:" . $_[0];
    }

My problems with this code are:

  • Consider what happens if the while loop is interrupted: by power failure, by user pressing CTRL-C, or because the print fails (due to disk full or disk quota exceeded, say). You've just corrupted your file. You've probably lost data. Worse, you don't know you've done it. And when you go to re-run the script after the interruption, you may spend a lot of time trying to figure out what's happened to your data ... That is, this idiom is not "re-runnable".
  • The unlink trick used to avoid clobbering the input file works on Unix, but may not work on other operating systems. In particular, when run on Windows, the above program clobbers the source file. That is, this idiom is not portable.

As discussed in Re-runnably editing a file in place, it seems sounder to first write a temporary file. Once you're sure the temporary file has been written without error (and after the permissions on the temporary are updated to match the original) you then (atomically) rename the temporary file to the original. In that way, if writing the new file is interrupted for any reason, you can simply re-run the program without losing any data.
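For concreteness, here is a minimal sketch of the temporary-file approach described above. It is my own illustration, not code from the book or from the linked node: the name rewrite_in_place() and the 'rewriteXXXXXX' template are made up for the example. It assumes the temporary file can be created in the same directory as the original (so that rename stays on one filesystem), and it does not fsync, so it makes no promise about what survives a power failure; how atomic rename is also depends on the platform.

```perl
use strict;
use warnings;
use File::Temp     qw( tempfile );
use File::Basename qw( dirname );

# Hypothetical helper: pass each line of $file through $transform,
# writing to a temporary file first and renaming it over the original
# only after everything has been written and closed without error.
sub rewrite_in_place {
    my ($file, $transform) = @_;

    open my $src, '<', $file
        or die "Cannot open '$file' for reading: $!";

    # Same directory as the original, so rename() does not cross filesystems.
    my ($dest, $tmp_file) = tempfile(
        'rewriteXXXXXX', DIR => dirname($file), UNLINK => 0,
    );

    my $ok = eval {
        while (my $line = <$src>) {
            print {$dest} $transform->($line)
                or die "Cannot write to '$tmp_file': $!";
        }

        # Close both handles before renaming; a failed close can signal
        # a write error (disk full, quota exceeded).
        close $dest or die "Cannot close '$tmp_file': $!";
        close $src  or die "Cannot close '$file': $!";

        # Give the temporary file the same permissions as the original.
        my @st = stat $file
            or die "Cannot stat '$file': $!";
        chmod $st[2] & 07777, $tmp_file
            or die "Cannot chmod '$tmp_file': $!";

        rename $tmp_file, $file
            or die "Cannot rename '$tmp_file' to '$file': $!";
        1;
    };

    if (!$ok) {
        my $error = $@ || 'unknown error';
        close $dest;          # harmless if already closed
        unlink $tmp_file;     # the original file has not been touched
        die $error;
    }
    return;
}
```

With this, the test transform from above would be called as rewrite_in_place('fred.tmp', sub { "hello:" . $_[0] }). If the loop is interrupted or a write fails, the original file is still intact and the script can simply be re-run; an interrupt by signal may leave a stray temporary file behind, which this sketch does not clean up.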

Please let me know what I've overlooked.

In reply to Perl Best Practices book: is this one a best practice or a dodgy practice? by eyepopslikeamosquito
