Re^3: avoiding a race ("No extra", "no-user" locking--miniscule race of no importance)

by BrowserUk (Patriarch)
on Sep 28, 2010 at 17:28 UTC


in reply to Re^2: avoiding a race (does lock, still racy)
in thread avoiding a race

That is not a non-locking mechanism. It just hands the locking off to the kernel, which locks the directory when it reads from or writes to it. It has the advantage that the kernel's locking implementation is very well tested.

The kernel is going to do its locking whatever file operations you do. Re-using it is good.

So, I guess you could call it a "no-extra, no-effort(or risk of getting it wrong)" locking mechanism.

Of course, errors might not all have such nice, unique, numeric identifiers ...

If you can't reduce the errors to something easily comparable in the filesystem, you'll have similar problems locating similar errors in the file itself. And globbing is capable of much more than just "string equality".
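
For instance--with purely hypothetical marker-file names, assuming each error gets a file named after its identifier--glob patterns can match whole families of errors, not just one exact name:

    # hypothetical marker names: <errid>-<host>.err in an errors/ directory
    my @same_error   = glob( 'errors/1042-*.err' );              # error 1042 from any host
    my @related_errs = glob( 'errors/{1042,1043,1044}-*.err' );  # several related codes at once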

But, most importantly, your solution (as described) has a race condition between stat and creating a file. You can probably fix that a couple of different ways.

If I knew how to do open(CREATE_NEW(*)) in Perl, I would suggest that. If the open() fails, it must have 'just' been created, so there's nothing else to do; you just move on anyway.

But realistically, it's probably a "problem" not worth the effort of solving. The idea is to avoid 300 emails. Getting 2 or even 3 shouldn't be a problem.

Update: The "race condition"--whether this process creates the new file, or some other process does it for you a few milliseconds before you do--doesn't trigger extra emails.

Nor does it delay their being sent at the appropriate time. The time window is probably less than the resolution of the file system timestamps. So: NO race condition!

Very simple. Very effective. Perfection is the enemy of "good enough".

(*)Ie. Create a new file; fail if it already exists.
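
A minimal sketch of that pattern, assuming the error's marker-file path is already in $file (sysopen with O_CREAT|O_EXCL, as a reply below points out, is the Perl spelling of "create new; fail if it exists"):

    use Fcntl;    # O_* constants
    use Errno;    # makes %! ($!{EEXIST}) available

    if( sysopen( my $fh, $file, O_WRONLY | O_CREAT | O_EXCL ) ) {
        close $fh;              # we created the marker; first to see this error
    }
    elsif( $!{EEXIST} ) {
        # someone else 'just' created it; nothing else to do, move on
    }
    else {
        die "$file : $!";       # a real failure, not the race
    }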


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^4: avoiding a race ("No extra", "no-user" locking--miniscule race of no importance)
by Kanji (Parson) on Sep 28, 2010 at 20:17 UTC
    If I knew how to do open(CREATE_NEW(*)) in Perl,

    Assuming NFS isn't involved, you could do that with sysopen (or IO::File):-

    use Fcntl;   # O_* constants
    sysopen(FH, $file, O_CREAT|O_EXCL);

        --k.


Re^4: avoiding a race (much ado)
by tye (Sage) on Sep 29, 2010 at 19:05 UTC

    Actually, the Linux kernel holds a mutex (an exclusive, not shared, lock) against the directory during readdir (for example), so re-using the kernel locking means that the processes must do their searching for a matching error (file) in single file. This is probably part of why directories containing a large number of files are notoriously slow in Unix. (And Perl's glob / readdir is notorious for at least sometimes being pathologically slow under Windows, but I don't know what locking Windows uses for directory operations.)

    And so the "no extra locking" claim is completely bogus. The extra locking case is "a ton of kernel mutex uses", not the "small number of kernel mutex uses to open a file and then get one shared lock".

    I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there. And you may not care about a few extra e-mails (or tons of extra e-mails when the directory gets bloated and pathologically slow to use) but the person asking the "avoiding a race" question probably does. I can certainly understand caring about my boss getting duplicate e-mails after he assigned me the task of making sure we don't get duplicate e-mails.

    But given 300 processes, I'd probably go with a separate, single process that de-dups errors rather than having 300 processes fighting over the list of errors (whether stored as lines in a file or files in a directory).
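
    A rough sketch of that shape--everything here (the tab-separated line format and the mail_admin() helper) is assumed, not from the original posts--with one reader owning all of the de-dup state, so there is nothing for the 300 workers to fight over:

        # hypothetical: the workers' error lines ("errid<TAB>message") arrive on STDIN,
        # via a pipe or a tail of a shared log; this single reader owns the de-dup state
        my %last_mailed;
        while( my $line = <STDIN> ) {
            chomp $line;
            my( $id, $msg ) = split /\t/, $line, 2;
            next if $last_mailed{ $id }
                and time - $last_mailed{ $id } < 3600;   # already mailed within the hour
            $last_mailed{ $id } = time;
            mail_admin( $id, $msg );                     # assumed helper
        }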

    - tye        

      I don't see how you justify that the race doesn't result in extra e-mails. I believe your analysis is mistaken there.

      Hm. The window your "still racy" referred to was the time between a failed stat and the open. Ie:

      open ERROR, '>', $file unless -e $file;
      close ERROR;

      And even in a full directory on a loaded system, that time is going to be measured--assuming you can actually measure it at all--in low milliseconds at the most.

      Now, what does that actually mean?

      It means that one of the other processes encountered the same error as you, and succeeded in creating the error file within those few milliseconds. So what?

      You then immediately overwrote it with a later time-stamp. But still, so what?

      Nothing! Because the error file got created. Nothing is going to take any action--like sending emails--as a result of that file's creation for another hour. From the OP:

      if it has been encountered before and the time stamp is greater than an hour ago it will mail the admin

      So the very worst effect of some other process creating the file instead of you is that the sending of the email is delayed by the difference between the original time-stamp and the new one. And that's just a few milliseconds at most.

      #! perl -slw
      use Time::HiRes qw[ time ];
      use threads qw[ stack_size 4096 ];

      my $file = 'theFile';

      ## 300 detached threads, each just polling for the file's existence
      async{ 1 until -e $file; }->detach for 1 .. 300;

      sleep 3;

      ## timestamps: before the -e test, before and after the open, and at the end
      my @times = time;
      unless( -e $file ) {
          push @times, time;
          open FILE, '>', $file or die "$file : $!";
          push @times, time;
          close FILE;
      }
      push @times, time;

      print for @times;

      unlink $file;

      __END__
      [20:51:05.40] C:\test>junk49
      1285789900.703
      1285789900.87509
      1285789901.02394
      1285789901.324

      With 300 clients stat'ing theFile (in a directory containing 1000 files), the window of opportunity for this irrelevant race condition is all of 300 milliseconds.

      And then only if the time-stamp resolution of the file-system is sufficient to actually discern the difference, which is unlikely.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        And even in a full directory on a loaded system, that time is going to be measured--assuming you can actually measure it at all--in low milliseconds at the most.

        No, not at all. Opening the file requires finding the file, which requires traversing the (possibly long) directory contents yet again (and thus contending with all of the mutex contention again also). With NTFS or a newer Linux file system (with the proper options enabled), the directory won't be stored as a simple list and the performance is probably not as easily pathological. A few months ago I again ran into a directory with way too many files in it, and it took many seconds, even minutes, to open a file (or to remove one). I haven't tried to replicate the problem on a more modern filesystem to see how well it scales. But I suspect there are plenty of file systems left in the world that were built without hash/tree directories.

        And then only if the time-stamp resolution of the file-system is sufficient to actually discern the difference, which is unlikely.

        And there you have your broken analysis, again. If X and Y fail to find 'file1' and then both create it, then the fact that the timestamp is not changed by whichever attempt is second has no bearing on the fact that both X and Y will then go on to send an e-mail. (Or, you can remove the race.)
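
        One way to remove it, sketched with the same sysopen approach mentioned upthread ($file, $error, and send_mail() are assumed names): O_EXCL makes the create atomic, so exactly one of X and Y "wins" and only the winner treats the error as newly seen.

            use Fcntl;    # O_* constants
            if( sysopen( my $fh, $file, O_WRONLY | O_CREAT | O_EXCL ) ) {
                close $fh;
                send_mail( $error );    # only the process that won the create mails
            }
            # else: the other process created it first and will (or already did) mail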

        - tye        
