http://qs321.pair.com?node_id=367587

In the thread at Copying a file to a temporary file, I made the (embarrassing) assumption that rename $oldname, $newname; would fail if a file called $newname already existed.

It came as a complete shock to me that perl's built-in rename will silently delete the existing file if $newname exists.
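A minimal demonstration of the surprise (assuming a POSIX system, where rename(2) silently replaces an existing target):

```perl
use strict;
use warnings;

# Set up two files with different contents.
open my $fh, '>', 'old.txt' or die $!;
print $fh "old data\n";
close $fh;

open $fh, '>', 'new.txt' or die $!;
print $fh "new data\n";
close $fh;

# On a POSIX system this succeeds and silently replaces new.txt.
rename 'old.txt', 'new.txt' or die "rename failed: $!";

open $fh, '<', 'new.txt' or die $!;
print scalar <$fh>;    # prints "old data" -- new.txt was clobbered
close $fh;
```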

Two questions

  1. Is this unprecedented for a rename command or function?
  2. Does anyone else think that this is a DWIM too far?

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Replies are listed 'Best First'.
Re: A DWIM too far?
by mpeppler (Vicar) on Jun 17, 2004 at 12:21 UTC
    It's the expected behavior, taken from the C rename(2) call. The perlfunc pod also cautions that the behavior of perl's rename depends on the system where you are running it, and refers to rename(2) documentation.

    Michael

Re: A DWIM too far?
by thor (Priest) on Jun 17, 2004 at 12:31 UTC
    What happens if, under Unix, you do the following: "touch temp1 temp2; mv temp2 temp1"? You'll be left with only temp1. Completely reasonable behavior, I think.

    thor

      Some people may find "reasonable behavior" and "what Unix does" not entirely synonymous in all situations ;-)

        BLASPHEMER! Ladies, get your beard and rocks, it's time for a stoning!

        </OB_Python>

        --
        tbone1, YAPS (Yet Another Perl Schlub)
        And remember, if he succeeds, so what.
        - Chick McGee

        I know that your tongue was in your cheek, but I don't need to take off my shoes to know that Unix has been around for longer than Windows has with fewer changes to basic things such as this. While this is a poor measure of correctness, I think it says something.

        But, as with all things Unix, if you want the hand holding, you can have it. Change 'mv' to 'mv -i' in my example and it won't clobber the file without asking first. Windows and the like assume that you're stupid and that they know better. I don't like that one bit.

        thor


      "touch temp1 temp2; mv temp2 temp1"? You'll be left with only temp1.

      No doubt that is true on your system but on RedHat flavour Linux you will get a prompt before the overwrite:

      [root@devel3 root]# touch temp1 temp2; mv temp2 temp1
      mv: overwrite `temp1'? n

      cheers

      tachyon

        I think you'll find that this happens only because RedHat sets up by default an alias (for root only) that rewrites 'mv' to 'mv -i'.

        Hugo

Re: A DWIM too far?
by grinder (Bishop) on Jun 17, 2004 at 14:22 UTC
    Is this unprecedented for a rename command or function?

    Unprecedented, no. This is the way Unix has always behaved. You have to explicitly ask it to warn you if things are about to be overwritten, via the -i command line switch. On the other hand, I know you happen to be (mainly) a Windows developer, and on that platform you can't rename an old file into an existing file.

    And it does say here "Changes the name of a file; an existing file NEWNAME will be clobbered."... but who reads the documentation?

    Does anyone else think that this is a DWIM too far?

    No, in that I am accustomed to the behaviour and in fact rely on it. I appreciate being able to rename a new file into the name of a production file and I don't care whether the file exists or not.

    Nonetheless, you do have a point. According to the principle of Least Surprise, perl should probably honour the underlying rename behaviour of the OS. But that opens up another can of worms. I have lots of cross-platform scripts written once, run on both Windows and Unix. What happens then? I like having the consistent behaviour on both systems.

    Maybe what is needed is a flag in Config that lets one choose which action to take when the target exists.

    PS: now, about those backups...

    - another intruder with the mooring of the heat of the Perl

      On the other hand, I know you happen to be (mainly) a Windows developer, and on that platform you can't rename an old file into an existing file.

      My recollection is that in the first "Windows" (aka "DOS"), you couldn't prevent ren or copy from overwriting files (I recall being quite frustrated at not having "-i" and losing files as a result).

      In modern Windows (aka "Win32"), you are correct that "ren" refuses to overwrite files. However, "copy" and "move" both have what is IMO the most sensible default: they ask before overwriting. Though I find it a bit strange that "ren" refuses to overwrite files while "copy" and "move" can't be made to act the same. You can prevent "copy" and "move" from asking ("/y"), but if you do, they unconditionally and silently overwrite.

      As for programming APIs, my preference would be to allow the programmer to specify. Actually, I'd require the programmer to specify whether clobbering should be done or an error should be returned. The API should allow this to be specified so that the operation can be done atomically. And I think no default behavior should be specified, just to force the programmer to consciously make that decision, as there are cases where either behavior is inappropriate.
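      The require-the-programmer-to-specify idea can be sketched like this (not from the thread; the no-clobber branch relies on link(2) failing atomically when the target already exists, which holds on POSIX filesystems but not across filesystems or on filesystems without hard links):

```perl
use strict;
use warnings;

# Hypothetical wrapper: the caller must say whether clobbering is wanted.
sub rename_explicit {
    my ($old, $new, $clobber) = @_;
    die "specify clobber => 0 or 1" unless defined $clobber;
    if ($clobber) {
        return rename $old, $new;    # POSIX rename: replaces $new
    }
    # link(2) fails atomically if $new already exists, so there is
    # no check-then-act window for another process to slip into.
    link $old, $new or return 0;
    unlink $old     or return 0;
    return 1;
}
```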

      - tye        

Re: A DWIM too far?
by demerphq (Chancellor) on Jun 17, 2004 at 14:34 UTC

    Personally I think the behaviour is Huffman-encoded properly. IME it's much more common to want to rename things and overwrite what was there before than it is to rename something without overwriting. So to question 2, I don't think this is a DWIM too far; I think it's more like the only right way to DWIM. :-) As for question 1, others seem to have clarified that.


    ---
    demerphq

      First they ignore you, then they laugh at you, then they fight you, then you win.
      -- Gandhi


Re: A DWIM too far?
by fletcher_the_dog (Friar) on Jun 17, 2004 at 14:28 UTC
    Whether getting an error or not is good is questionable, but the behavior is very consistent with unix behavior. For example, "cp file1 file2" will overwrite "file2", if it already exists, with nary a complaint. Windows, on the other hand, polices your renaming, copying, and removing of files with sometimes helpful and sometimes annoying messages. Maybe you could make a module called WindowsLike::File::Functions that imported copy, rename, and unlink functions that gave you errors when you try to mess with already-existing files.
Re: A DWIM too far?
by FoxtrotUniform (Prior) on Jun 17, 2004 at 23:48 UTC

    Given that we've been over the rename(2) behaviour-copying and so on, I thought I'd hijack part of this thread for a more abstract discussion:

    What's better, familiar behaviour or intuitive behaviour?

    I first came across this argument in Maguire's book Writing Solid Code. His example (and mine) was a wrapper for C's malloc(3) and friends, used for instance to detect memory leaks. You can call stdlib realloc with a NULL pointer, in which case it acts like malloc, or with a zero size, in which case it acts like free. Maguire argues that in almost every case, NULL-pointer or zero-size calls to realloc are errors rather than clever tricks. Accordingly, he removed this behaviour from his realloc wrapper. I did the same when I wrote a realloc wrapper, for the same reasons.

    In general, I tend to think that duplicating broken behaviour in the interests of familiarity is a bad idea. The question, then, becomes: "Is rename's behaviour broken, or just unexpected?"

    --
    F o x t r o t U n i f o r m
    Found a typo in this node? /msg me
    % man 3 strfry

      I'm not sure which I find the most bizarre:

      1. That the silently destructive behaviour was ever adopted in the first place.
      2. That such a (IMO) flawed behaviour should have been perpetuated.
      3. Or the outpouring of support for it.

      As is, the behaviour makes it impossible to safely rename a file.

      You can test existence before issuing the rename, but in any multi-tasking environment there is always the possibility that the target file will be created between the test for existence and the rename. This renders the often hard-won (and IMO, sacrosanct) atomicity of the OS rename API useless.
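      The race described here can be made concrete (a sketch; the comment marks the window):

```perl
use strict;
use warnings;

# Check-then-rename is NOT safe: another process can create $new in
# the window between the -e test and the rename, and the rename will
# then silently destroy that file anyway.
sub rename_unless_exists {
    my ($old, $new) = @_;
    return 0 if -e $new;          # check ...
    # <-- another process may create $new right here
    return rename $old, $new;     # ... then act: may still clobber
}
```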

      This rates up there with non-exclusive-opens-by-default and cooperative locking as remnant behaviours of a bygone era that should have been superseded long ago.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
        As is, the behaviour makes it impossible to safely rename a file.

        That depends on your definition of "safely". The other possible default (not renaming if the target file exists) also prevents "safely" renaming a file, because it prevents the to-be-renamed file from being renamed. So that's also "unsafe".

        You should also consider what should happen in the following situation:

        $ touch file
        $ ln -s target link
        $ rename file link
        Rename or not?

        Note that if you want to do a "safe" rename, you can do so, although it involves a copy:

        use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

        my ($f1, $f2);
        if (sysopen $f1, "new", O_WRONLY | O_CREAT | O_EXCL) {
            open $f2, '<', "old" or die;
            print $f1 <$f2>;
            close $f1            or die;
            unlink "old"         or die;
        }
        You can test existance before issuing the rename, but in any multi-tasking environment there is always the possibility that the target file will be created between the test for existance and the rename.
        Frankly, I don't get your point. What are you trying to say here? That the rename is making you lose data? That I disagree with. Suppose you are multi-tasking: one thread wants to create a file "new", and another thread wants to move "old" to "new", but you want to end up with both the data that is in "old" and the data that would be placed in "new" by the other thread. Suppose you had a 'rename' that doesn't rename if there is already a "new" file. Does that win you anything? Not if the renaming thread goes first. Then "old" will be renamed to "new", and then wiped out by the other thread that's creating "new". Rename is not to blame in this scenario - the programmer is to blame for not synchronizing two threads that modify the same resource.
        This rates up there with non-exlusive-opens-by-default and cooperative locking as remnant bahaviours of a bygone era that should have been superceded long ago.
        Well, you can always change your system calls. Oh, wait, you can't.

        Abigail

        As is, the behaviour makes it impossible to safely rename a file.
        It makes it harder, at least:
        sub myrename {
            my $mv = ( $^O eq 'MSWin32' ) ? 'move /y' : 'mv -i';
            system("$mv $_[0] $_[1]");
            return $? == 0;
        }
        myrename('abc', 'def') or die "rename failed";
        I can't recall how to do this to the CORE, perhaps someone will remind me?

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of
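        For what it's worth, the CORE override QM asks about is done by installing a sub into CORE::GLOBAL at compile time; a sketch (not from the thread) of a no-clobber rename for the whole program:

```perl
use strict;
use warnings;
use Errno qw(EEXIST);

# The override must be installed in a BEGIN block, before any
# calls to rename are compiled.
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::rename = sub {
        my ($old, $new) = @_;
        if (-e $new) {
            $! = EEXIST;    # report "File exists"
            return 0;       # refuse to clobber
        }
        return CORE::rename($old, $new);
    };
}
```

Note that the -e test reintroduces the check-then-act race; this only changes the default, it does not make the rename atomic.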

        Here's something else you might find bizarre. There's a programming language many people greatly enjoy using that has very little stricture by default, lots of similar but slightly different ways to do the same thing, an object system that lets you use whatever type of datastructure you damn well please, and generally encourages a culture of "stay out of my living room because I ask, not because I have a shotgun." Perhaps you find it bizarre, but I find it downright refreshing.

        I feel the same way about cooperative locking, non-exclusive opens, and -- to a far lesser extent -- the overwriting rename(). Do I wish there was a reasonable way to rename without clobbering-by-default? Yes, I do. That doesn't make me like the current behavior any less. Personally, I like a platform that lets me do what I want, even if what I want might be shooting myself in the foot. If this mentality is so strange to you, I wonder how exactly you've managed to stick with Perl as long as you have.

      What's better, familiar behaviour or intuitive behaviour?

      Which is bluer, blue or blue?

Re: A DWIM too far?
by graff (Chancellor) on Jun 18, 2004 at 02:14 UTC
    Oh yeah... This is on a par with unlink() being the equivalent of unix  rm -f (i.e. "don't ask questions, just delete the file(s)!") -- and when you combine this with an all-too-common misunderstanding about unix file permissions, you really see how unix gets its reputation as being "not for the feeble-minded or faint-hearted".

    The misunderstanding involves setting individual file permissions to "read-only", and thinking that this protects the file from being deleted. It doesn't. The file is protected from having its contents altered by being opened for write access, and that's it.

    Now, if the directory containing the file is also set for "read-only" access, then the file is safe, but so long as there is write access on the directory, the file can be deleted (or another file can be renamed to displace/obliterate it).

    The default behavior of unix "rm" involves always asking for interactive confirmation before deleting a file that is set for read-only access, but "rm -f" is always available to bypass that safeguard (e.g. when running in a script or makefile), and this is what Perl's "unlink()" does.
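    The read-only-file point above is easy to verify (assuming a POSIX system and a writable current directory):

```perl
use strict;
use warnings;

# Perl's unlink acts like "rm -f": a read-only file in a writable
# directory is deleted without complaint, because deletion is governed
# by the directory's permissions, not the file's own mode.
open my $fh, '>', 'readonly.txt' or die $!;
close $fh;
chmod 0444, 'readonly.txt' or die $!;   # file itself is read-only

unlink 'readonly.txt' or die "unlink failed: $!";
```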

    I'm not saying it's good or bad for it to be this way. Just like a chainsaw is not intrinsically good or bad... It simply requires care and respect.

      I agree, though I was less concerned with the behaviour of system utilities. I can always adapt those using aliases or bat files or whatever.

      And I'm rarely in favour of "do you really want to do what you have just asked me to do" prompts, especially the accursed pop-up variety.

      But when it came to Perl overriding the standard behaviour of a usually safe (on my OS) API to provide compatibility with a potentially destructive (and IMO, questionable) behaviour of a.n.other OS, let's just say it didn't comply with my idea of 'least surprise'.

      Perhaps the criticisms of "who reads the documentation anyway" are valid here. I never read the docs for rename simply because I didn't think I needed to. I just cannot see the circumstance where rename failing because the target file existed would ever be a burden. If I know that the file might exist and that I want to overwrite it, then I just attempt to delete it first.

      I realise that this would be non-atomic, and that there is a chance in a multi-tasking system that another process could re-create the deleted file between the delete and the rename, and the rename would then again fail. But so what? I cannot conceive of any circumstance where the unix behaviour would be the "right thing" in this situation.

      I'm not sure whether the unix destructive rename is atomic at the syscall level or not, but there are two possibilities:

      1. The delete/rename is atomic.

        If true, then once my application has renamed the file, then the other application that was trying to create the file will either

        • Succeed and overwrite my newly renamed data.
        • Fail if he bothered to use a deny-shared open mode.
      2. The delete/rename is not atomic.

        If true, the other app could potentially re-create the deleted file prior to the rename? What then? Does the rename then fail?

      I realise that well-written apps that use sensible choices of share flags and/or file permissions can work around this, but it still seems a strange choice of default behaviour.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

        FYI, POSIX rename() is atomic. Perl's rename is based on POSIX rename() but Perl may use a different API when compiled on a system that doesn't (appear to) have POSIX rename() (hence the weasel words about it perhaps not silently clobbering). Therefore, Perl's rename will not be atomic on some non-POSIX systems.

        I can't think of any APIs that Perl has taken from Win32. Where a Perl API matches a Win32 API it is almost always because the Win32 API was made to match the POSIX (or pre-POSIX) API. While I concur with your preference to "use the safer of the two" when selecting, I'd certainly be more surprised to find Perl using a Win32 API that conflicts with a POSIX API.

        BTW, I wish Perl's system() had asynchronous launching built in so that could be done portably. It is rather ironic that asynchronous launching is built into the under-the-covers API that Perl uses, but Perl doesn't expose this nice feature in the language -- well, on OS/2-ish systems (including Win32) you can use system(1, ...) to launch a command into the background, but that isn't portable, and making that API choice portable poses problems on less OS/2-ish systems like Unix, where commands are passed a list of arguments instead of a command line.

        - tye