http://qs321.pair.com?node_id=64256

DeusVult has asked for the wisdom of the Perl Monks concerning the following question:

I have been cursed with the task of adapting...<shudder>...legacy code. I have many problems, but I'll only ask for your kindly help on one at a time.

Inescapable Rule of Programming #1: before one can hope to debug code, one must understand what it does.

Problem the first

Context: there is a database, and this script reads a text file which describes the sorts of changes that need to be made to the DB (done, thankfully, through a separate API that I need not concern myself with).

Every now and again in the program, there appears a line like this: if (-e $filename) { unlink ($filename); }

Try as I might, I can find neither rhyme nor reason to these lines. Unfortunately, I am additionally hampered by a lack of perl reference manuals and a broken perldoc! So all I have at my disposal is my memory and, of course, perlmonks :)

What confuses me is that I thought -e checked for existence. So that line of code reads "if the file exists, unlink it."

What really confuses me is the call to unlink. I'm not entirely sure what it does, but it sounds sinister. Why this is so confusing is that the files that are being so wantonly unlinked are fairly important data files necessary for the execution of the script (they describe the necessary DB changes), so it seems to me odd that the script would be doing something untoward to them. From a little PM supersearching, I found that unlink seems to have something to do with symbolic links, but that makes no sense in this context because the files in question are not symbolically linked to anything!

So my fellow monks, my questions for you are these:

  1. Is my understanding of -e correct?
  2. What exactly does unlink do, and is it as vicious toward the target file as it sounds?
  3. Taken as a whole, what does that if statement do?
  4. From a philosophical standpoint, can anyone think of a reason why someone might want to do whatever that is?

As always, my thanks go out to the selfless denizens of perlmonks.

Some people drink from the fountain of knowledge, others just gargle.

Re: -e and unlink
by Masem (Monsignor) on Mar 14, 2001 at 03:36 UTC
    You've basically answered most of your questions: -e does check for existence, and 'unlink' is pretty much the same as doing a unix 'rm <file>'. (unlink will not remove directories, however.) So, to 3), the line checks to see if the file is there, and deletes it if it is.

    Why would you use this? Obviously, I'd not use it for data files, but I would use it for lock files, which might or might not have been created, to clean them away at the end of a program. Or, if I were about to write data to a temporary file before moving it into place over another file, I'd make sure that temp file wasn't there before starting. In the case of the DB transactions, maybe the changes are written to temporary files until the transaction completes (in case the program fails before a COMMIT is given to the database), and once the COMMIT is given, the files are unnecessary and are removed. But there are a number of possible reasons why that line appears so often in your program, and without further context, it's hard to speculate.
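
    The temporary-file case described above might look roughly like the sketch below. It is only an illustration, not the legacy script's actual logic: the file names are made up, and everything happens in a scratch directory that is cleaned up at exit.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical file names, created in a scratch directory removed at exit.
my $dir    = tempdir( CLEANUP => 1 );
my $target = File::Spec->catfile( $dir, 'changes.txt' );
my $tmp    = "$target.tmp";

# Clear out a stale temporary file left by an earlier, interrupted run.
if ( -e $tmp ) {
    unlink $tmp or die "Cannot unlink $tmp: $!";
}

# Write the new contents to the temporary file first ...
open my $out, '>', $tmp or die "Cannot write $tmp: $!";
print {$out} "new data\n";
close $out or die "Cannot close $tmp: $!";

# ... then move it into place in a single step.
rename $tmp, $target or die "Cannot rename $tmp to $target: $!";

# After the rename, the temporary name is gone and the target exists.
my $tmp_gone     = !-e $tmp;
my $target_there = -e $target;
```

    Writing to a temporary name and renaming afterwards means a reader of the target file never sees a half-written version.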


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: -e and unlink
by stephen (Priest) on Mar 14, 2001 at 03:51 UTC
    If you have access to Perlmonks, you have access to a manual. perlman:perl. All the manpages are here.

    Also, you might want to check out Martin Fowler's excellent book on dealing with legacy code, Refactoring. Basically, you can't fix code until you understand it, and some legacy code is not comprehensible as written. Refactoring allows you to gradually improve existing code in small steps, checking along the way to make sure you don't break existing functionality.

    stephen

Re: -e and unlink
by TStanley (Canon) on Mar 14, 2001 at 03:41 UTC
    To answer your questions:
    1. Your understanding of the -e switch is correct. It checks for the existence of a file.
    2. According to Programming Perl, 3rd Ed, the unlink function deletes a list of files,
    so it does exactly what you think it does.
    3. The statement will look for $filename, and delete it if it exists.
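
    To make point 2 concrete, here is a small sketch showing that unlink takes a list and returns the number of files it actually removed. The file names are invented for the example and live in a scratch directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Scratch directory and hypothetical file names for the demonstration.
my $dir   = tempdir( CLEANUP => 1 );
my @files = map { File::Spec->catfile( $dir, $_ ) } qw(a.txt b.txt);

# Create two empty files.
for my $file (@files) {
    open my $fh, '>', $file or die "Cannot create $file: $!";
    close $fh or die "Cannot close $file: $!";
}

# Pass a name that does not exist along with the real ones.
my $missing = File::Spec->catfile( $dir, 'missing.txt' );

# unlink returns how many files it successfully deleted.
my $removed = unlink @files, $missing;

# Count how many of the real files are still present afterwards.
my $still_there = grep { -e $_ } @files;
```

    Because the return value is a count, a caller deleting several files at once can compare it with the number of names passed in to notice that something was not removed.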

    TStanley
    In the end, there can be only one!
Re: -e and unlink
by japhy (Canon) on Mar 14, 2001 at 03:47 UTC
    The condition of that if statement is pointless. You can't delete a file that doesn't exist.
    unlink $filename;
    is fine.

    japhy -- Perl and Regex Hacker
      japhy's reply may be worth expanding on.

      IF you don't care about reporting errors, then unlink $filename; is sufficient.

      IF you DO care about reporting errors (such as a failure to remove a file that your script lacks permission to remove), then you'll want to do something like:

      if ( -e $filename ) { unlink($filename) or die "$filename: $!" }
        But you can still avoid the -e test:
        unlink $filename or $! == 2 or die "Cannot unlink $filename: $!";
        On my system, "2" is the code for a missing file for unlink. If you don't mind the regex hit, you could also use:
        unlink $filename or $! =~ /no such/i or die "Cannot unlink $filename: $!";

        -- Randal L. Schwartz, Perl hacker


        update:

        This is actually faster than a separate test and unlink (one O/S call instead of two), and is less prone to race conditions. I'd actually scream at someone who did a -e/unlink pair instead of this in a security or performance related application.
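
        For readers who would rather not hard-code "2" or match on the message text, one possible variation uses the symbolic constant from the standard Errno module. This is a sketch, not part of the original reply; remove_if_present is a hypothetical helper name.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: returns 1 if the file was removed, 0 if it was
# already absent, and dies on any other failure (e.g. permissions).
sub remove_if_present {
    my ($filename) = @_;
    return 1 if unlink $filename;    # file existed and was deleted
    return 0 if $! == ENOENT;        # no such file: nothing to do
    die "Cannot unlink $filename: $!";
}

# Demonstration in a scratch directory.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'stale.lock' );

open my $fh, '>', $file or die "Cannot create $file: $!";
close $fh or die "Cannot close $file: $!";

my $first  = remove_if_present($file);    # file is there
my $second = remove_if_present($file);    # file is already gone
```

        As in the one-liners above, there is still only a single system call per attempt, so there is no gap between a test and the removal.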

Re: -e and unlink
by AgentM (Curate) on Mar 14, 2001 at 03:54 UTC
    Keep in mind that, on Unix-like systems, as long as a file descriptor is open on a certain file, the file's data is not removed. The program may still legally read from or write to the file. Only when the last file descriptor is closed does the file really disappear (unless you have cool retrieval software or some machines that support MRI). Be careful: the file may still be in use after unlinking. This is most often done with temporary files.
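
    A minimal sketch of that behaviour, assuming a Unix-like system (on some other platforms unlink on an open file may simply fail). The file name is hypothetical and lives in a scratch directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'scratch.txt' );

# Open the file for reading and writing, then remove its name.
open my $fh, '+>', $path or die "Cannot open $path: $!";
unlink $path or die "Cannot unlink $path: $!";

# The directory entry is gone ...
my $name_gone = !-e $path;

# ... but the open handle can still be written to and read from.
print {$fh} "still writable\n";
seek $fh, 0, 0 or die "Cannot seek: $!";
my $line = <$fh>;

# Closing the last handle is what finally releases the data.
close $fh or die "Cannot close handle: $!";
```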
    AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.