
Semaphore puzzle

by jerryhone (Sexton)
on Feb 22, 2021 at 18:12 UTC ( [id://11128659]=perlquestion )

jerryhone has asked for the wisdom of the Perl Monks concerning the following question:

Brothers,
I'm seeking wisdom and potentially flashes of inspiration... I have a process that receives files via NDM from an external partner in pairs: a data file and a status file. The files may arrive in any order, and anything from minutes to fractions of a second apart. I'm trying to create a process that loads the file content into an Oracle database via a Perl script triggered after each file arrives. Although the files can arrive in pseudo-parallel, I need to process them in sequence, so I'm trying to lock them with a semaphore (IPC::Semaphore). If they arrive separated by, say, 1 second the lock works nicely, but if they arrive a tenth of a second apart, both Perl processes decide that they're the first one, and both create and initialize the semaphore.

$sem = IPC::Semaphore->new( 4321, 1, S_IRUSR | S_IWUSR );
if ( $sem ) {
    # Semaphore already exists so just open it
    print "Semaphore already exists - just open it\n";
    $sem = IPC::Semaphore->new( 4321, 1, S_IRUSR | S_IWUSR );
}
else {
    # The semaphore didn't already exist so create it
    print "Create semaphore \n";
    $sem = IPC::Semaphore->new( 4321, 1, IPC_CREAT | S_IRUSR | S_IWUSR );
    print "Semaphore created\n";
    $sem->setval(0,1);
    print "Semaphore initialised\n";
}
print "Locking other threads\n";
$sem->op(0, -1, SEM_UNDO);

Depending on exact timing, I see that it's possible for the first process to create the semaphore and attempt to lock out the other process, but the other one, running a fraction behind, has not detected the creation, so it does its own and sets the semaphore to 1, revoking its partner's lock! I can't see a foolproof way of getting around this, so any divine inspiration gratefully received.
Jerry

Replies are listed 'Best First'.
Re: Semaphore puzzle
by choroba (Cardinal) on Feb 22, 2021 at 23:02 UTC
    The race condition is not that the second process doesn't detect the creation. In fact, the second process runs its existence check after the first process has checked (and found nothing) but before the first process has created the semaphore. You need to try to create the semaphore straight away; if the creation fails, you know you are the second one.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Thanks for the suggestion. I have already tried an exclusive creation (IPC_CREAT | IPC_EXCL), which I understand is what you're suggesting. I hit the same issue... the files arrive so closely together that neither create process detects that the semaphore has already been created, so they both create successfully!

        I think choroba was suggesting this (untested):

        $sem = IPC::Semaphore->new( 4321, 1, S_IRUSR | S_IWUSR | IPC_CREAT | IPC_EXCL );
        if ( $sem ) {
            # New semaphore
            print "Semaphore created\n";
            $sem->setval(0,1);
            print "Semaphore initialised\n";
        }
        else {
            # Semaphore already exists so just open it
            print "Semaphore already exists - just open it\n";
            $sem = IPC::Semaphore->new( 4321, 1, S_IRUSR | S_IWUSR );
        }
        print "Locking other threads\n";
        $sem->op(0, -1, SEM_UNDO);

        But there is still a race condition, where the second process can attach to and use the semaphore after the first process creates it and before the first process initializes it.
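
One commonly described way to close that remaining window is to make late arrivals wait until the creator has performed its first semaphore operation, which the kernel records as a nonzero sem_otime (the Linux semget(2) notes mention this check). What follows is only a minimal sketch under that assumption: the helper name acquire_pair_lock, the 10 ms polling interval and the 5 second timeout are all hypothetical, and it is worth confirming on your platform that sem_otime really stays zero until the first semop.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_CREAT IPC_EXCL S_IRUSR S_IWUSR SEM_UNDO);
use IPC::Semaphore;
use Time::HiRes qw(sleep);

# Hypothetical helper: returns an IPC::Semaphore that this process has locked.
sub acquire_pair_lock {
    my ($key, $timeout) = @_;
    $timeout //= 5;    # seconds to wait for the creator; an arbitrary choice
    my $perms = S_IRUSR | S_IWUSR;

    # Exclusive create: succeeds for exactly one process per key.
    my $sem = IPC::Semaphore->new($key, 1, $perms | IPC_CREAT | IPC_EXCL);
    if ($sem) {
        # Creator: set the value to 0, then raise it to 1 with a real semop.
        # The semop is what makes sem_otime nonzero, i.e. "initialised".
        defined $sem->setval(0, 0) or die "setval failed: $!";
        $sem->op(0, 1, 0) or die "initial op failed: $!";
    }
    else {
        # Someone else created it: open it, then wait until it is initialised.
        $sem = IPC::Semaphore->new($key, 1, $perms)
            or die "cannot open semaphore: $!";
        my $waited = 0;
        while (1) {
            my $st = $sem->stat or die "stat failed: $!";
            last if $st->otime;
            die "semaphore was never initialised\n" if $waited >= $timeout;
            sleep 0.01;
            $waited += 0.01;
        }
    }

    # Take the lock; SEM_UNDO gives it back if this process dies.
    $sem->op(0, -1, SEM_UNDO) or die "lock failed: $!";
    return $sem;
}
```

Release with `$sem->op(0, 1, SEM_UNDO)` when the import is done. Note that the creator never calls setval on a semaphore that others may already be using, which is what kept wiping out the partner's lock in the original code.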

Re: Semaphore puzzle
by jcb (Parson) on Feb 23, 2021 at 02:29 UTC

    The better answer that I have used in the past for loosely-coordinated multi-processing scripts is to use the filesystem. I will presume that neither the data nor status file reliably arrives first, so we must detect when a pair of files is present, lock the pair, and then proceed with processing.

    The simple solution (assuming all workers are running on the same node, as network filesystems can screw this up) is to create a third "flag" file, using sysopen with O_CREAT|O_EXCL from Fcntl, the successful creation of which acts as acquiring a lock for that pair. A process that fails to acquire this lock simply moves on to the next pair, confident that another worker has already claimed that pair.

    There is a small potential problem here with stale flag files, but that can be remedied by either shutting the worker horde down and cleaning up any flags left (as I have used in the past when I needed this for a quick-and-dirty multi-process solution) or writing the PID into the flag file to allow a single cleanup process to remove any flags left by workers that are no longer running. When cleanup removes a flag, another worker will eventually find that pair and process it. You will need to manually supervise this kind of operation, because the most likely reason for a worker to fail to complete processing is that that pair exposes a bug in the worker code and will reliably crash the worker process.
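
A minimal sketch of the flag-file idea, assuming all workers share one local filesystem. The ".lock" suffix and the helper names claim_pair and flag_is_stale are made up for illustration; the stale check only tests whether the recorded PID still exists, so it cannot tell a recycled PID from the original worker.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);
use Errno qw(EEXIST ESRCH);

# Try to claim the pair identified by $base. Returns the flag path on
# success, or undef if another worker already holds the claim.
sub claim_pair {
    my ($base) = @_;
    my $flag = "$base.lock";    # hypothetical naming convention
    if (sysopen(my $fh, $flag, O_WRONLY | O_CREAT | O_EXCL, 0644)) {
        print {$fh} "$$\n";     # record our PID for the cleanup process
        close $fh or die "cannot write $flag: $!";
        return $flag;
    }
    return undef if $!{EEXIST}; # somebody else got there first
    die "cannot create $flag: $!";
}

# True if the flag names a PID that no longer exists. An unreadable or
# still-empty flag is treated as live, to stay on the safe side.
sub flag_is_stale {
    my ($flag) = @_;
    open my $fh, '<', $flag or return 0;
    my $line = <$fh>;
    close $fh;
    return 0 unless defined $line && $line =~ /^(\d+)$/;
    my $pid = $1;
    return 0 if kill(0, $pid);  # process exists and we may signal it
    return $!{ESRCH} ? 1 : 0;   # EPERM means it exists but is not ours
}
```

A worker would call claim_pair once both files of a pair are present, process the pair if it gets a path back, and unlink the flag afterwards.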

Re: Semaphore puzzle
by Corion (Patriarch) on Feb 23, 2021 at 09:12 UTC

    Maybe I'm misunderstanding the problem. Why don't you lock one of the files or the database table you're importing into (or both)? You have a defined order in which they should be imported, so the program that gets launched to import the (defined) first file locks the file (and the table). The second program launched finds the file/table locked and just exits.
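
For the file-locking route, here is a minimal sketch using flock on a separate lock file, assuming both importers run on the same host and agree on a lock path derived from the shared part of the file name. The helper name with_pair_lock and its nonblocking option are hypothetical. Unlike a semaphore, an flock lock has no separate initialisation step, so there is no create-then-initialise window to race through.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_CREAT O_RDWR);
use Errno;

# Run $work while holding an exclusive lock on $lock_path.
# Returns 1 if the work ran, 0 if nonblocking was requested and the
# lock was already held by someone else.
sub with_pair_lock {
    my ($lock_path, $work, %opt) = @_;

    # O_CREAT without O_EXCL: it does not matter who creates the file.
    sysopen(my $fh, $lock_path, O_RDWR | O_CREAT, 0644)
        or die "cannot open $lock_path: $!";

    my $mode = LOCK_EX | ($opt{nonblocking} ? LOCK_NB : 0);
    unless (flock($fh, $mode)) {
        my $busy = $!{EWOULDBLOCK} || $!{EAGAIN};
        my $err  = "$!";
        close $fh;
        return 0 if $busy;
        die "flock on $lock_path failed: $err";
    }

    $work->();
    close $fh;    # closing the handle releases the lock
    return 1;
}
```

With the default blocking mode the second importer simply waits for the first to finish, which gives the strict sequencing asked for; with `nonblocking => 1` it can exit instead, as described above.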

      It's all down to timing. There isn't the time to implement a lock (or a semaphore) before the other file is returned.
      I've reconsidered a complete redesign based on what I've encountered here, but there are multiple other moving parts and I have to go with what I have - for the moment anyway.
Re: Semaphore puzzle
by jerryhone (Sexton) on Feb 23, 2021 at 09:15 UTC
    Thanks all. Just getting others' suggestions is often enough to trigger thoughts on different ways to achieve things. A couple of details that I hadn't included in my original post (I thought they'd add unnecessary clutter) are that
    (a) I send a request for the files - the names of my request and the returned files are the same other than the last letter.
    (b) The semaphore key I use is a hash of the file name (excluding the last letter) so that I can have multiple streams running together without interaction. It's clear that the big delay here is the time it takes to create the semaphore in the first place, so my plan today is to use my requesting process to create and initialize the semaphore but not actually use it - that'll be left to the receiving processes. Hopefully the turnaround time between request and response is enough for the semaphore to come into existence. Watch this space...
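
The hashing scheme itself isn't shown in the thread, so the following is only a guess at its shape: a sketch that derives a key from the file name minus its last letter using Digest::MD5, and a requester-side helper that creates and initialises the semaphore before the request goes out. The names sem_key_for and precreate_semaphore are hypothetical, and two different streams could in principle hash to the same key.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);
use IPC::SysV qw(IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# Map a request or response file name to a System V IPC key.
sub sem_key_for {
    my ($file_name) = @_;
    my $stem = substr($file_name, 0, -1);            # drop the last letter
    my $key  = unpack('N', md5($stem)) & 0x7FFFFFFF; # keep it a positive int
    return $key || 1;                                # 0 would mean IPC_PRIVATE
}

# Called by the requesting process BEFORE the request is sent, so that no
# receiver can be holding the semaphore when setval runs.
sub precreate_semaphore {
    my ($file_name) = @_;
    my $sem = IPC::Semaphore->new(sem_key_for($file_name), 1,
                                  IPC_CREAT | S_IRUSR | S_IWUSR)
        or die "cannot create semaphore: $!";
    defined $sem->setval(0, 1) or die "setval failed: $!";
    return $sem;
}
```

The receiving scripts would then only open the semaphore (no IPC_CREAT, no setval) and lock it, so correctness no longer depends on how far apart the two files arrive, only on the semaphore existing before the first response does.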
Re: Semaphore puzzle
by Anonymous Monk on Feb 23, 2021 at 17:40 UTC

    You need a way to know whether you're acquiring or creating the semaphore. So you'd need another semaphore that is already open and ready to use, and you'd acquire it before running the open-or-create code you posted. What is most likely happening is that while the first thread is attempting to open/create (most likely create), the second thread comes in and doesn't have to create, but just opens the semaphore. So you really need a kind of double-checked locking.

    NON-PERL pseudo code
    SemLock = Semaphore->new (....)   # somewhere global in the script
    if (sem) {
        SemLock->lock()
        sem->lock()
        SemLock->unlock()
    } else {
        SemLock->lock()
        sem = new Semaphore(....)
        sem->lock()
        SemLock->unlock()
    }
    doStuff()
    sem->unlock()
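
A Perl sketch of the same guard idea, with two deliberate deviations that are my assumptions rather than anything stated above: the guard is an flock on a fixed file instead of a second semaphore (so the guard itself needs no initialisation), and the guard is released before blocking on the real semaphore so that waiters do not queue up behind it. The helper name lock_semaphore_guarded and the guard path are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_CREAT O_RDWR);
use IPC::SysV qw(IPC_CREAT IPC_EXCL S_IRUSR S_IWUSR SEM_UNDO);
use IPC::Semaphore;

# Open or create the semaphore for $key while holding a guard lock,
# then lock the semaphore itself and return it.
sub lock_semaphore_guarded {
    my ($key, $guard_path) = @_;
    my $perms = S_IRUSR | S_IWUSR;

    # Guard: only one process at a time may run the check-then-create step.
    sysopen(my $guard, $guard_path, O_RDWR | O_CREAT, 0644)
        or die "cannot open guard $guard_path: $!";
    flock($guard, LOCK_EX) or die "cannot lock guard: $!";

    my $sem = IPC::Semaphore->new($key, 1, $perms);
    unless ($sem) {
        # Nobody has created it yet, and nobody else can while we hold
        # the guard, so create and initialise in one protected step.
        $sem = IPC::Semaphore->new($key, 1, $perms | IPC_CREAT | IPC_EXCL)
            or die "cannot create semaphore: $!";
        defined $sem->setval(0, 1) or die "setval failed: $!";
    }

    close $guard;    # release the guard before we might block

    $sem->op(0, -1, SEM_UNDO) or die "lock failed: $!";
    return $sem;
}
```

Every process has to go through this helper for the guard to mean anything; a process that calls IPC::Semaphore->new directly would bypass it.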

    Hope this helps, and makes sense

    dk
Re: Semaphore puzzle
by Anonymous Monk on Feb 23, 2021 at 01:21 UTC
    One alternative strategy would be to have the "triggered Perl script" simply save the data – perhaps under a randomly-coined name – then write an entry to a queue file or table giving a timestamp and the filename. A separate process, perhaps triggered by a crontab entry, wakes up to pull entries from this queue and actually post the data.
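
A minimal sketch of the queue-file variant, assuming a single tab-separated queue file on a local filesystem. The helper names enqueue_file and drain_queue and the record layout (epoch timestamp, tab, file name) are made up for illustration. Entries are appended under an exclusive flock, so file order is arrival order.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Called by the triggered script: record that $file_name has arrived.
sub enqueue_file {
    my ($queue_path, $file_name) = @_;
    open my $q, '>>', $queue_path or die "cannot open $queue_path: $!";
    flock($q, LOCK_EX) or die "cannot lock $queue_path: $!";
    print {$q} join("\t", time, $file_name), "\n";
    close $q or die "cannot write $queue_path: $!";   # flushes, then unlocks
}

# Called by the cron-driven worker: take every pending entry and empty
# the queue. Returns a list of [timestamp, file_name] pairs in arrival order.
sub drain_queue {
    my ($queue_path) = @_;
    open my $q, '+<', $queue_path or return;          # no queue file yet
    flock($q, LOCK_EX) or die "cannot lock $queue_path: $!";
    my @entries = map { chomp; [ split /\t/, $_, 2 ] } <$q>;
    seek($q, 0, 0)   or die "seek failed: $!";
    truncate($q, 0)  or die "truncate failed: $!";
    close $q;
    return @entries;
}
```

Because only the single worker touches the database, the two files of a pair can never be loaded concurrently, however close together they arrive. If the worker dies after draining but before loading, those entries are lost, so a real version would want to record progress somewhere.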