Re: OT: Locking and syncing mechanisms available on *nix.

by Illuminatus (Curate)
on Mar 27, 2011 at 18:31 UTC [id://895783]


in reply to OT: Locking and syncing mechanisms available on *nix.

*nix pthreads really only provides condition variables and mutexes as thread-level mechanisms for safeguarding concurrency. At the system level, *nix typically provides semaphores, which can act as a kind of combination of the two (i.e. counting semaphores can be used to mimic behavior similar to that of condition variables). Semaphores are also system-wide, and can be used to sync access to things beyond process scope (shared memory, files, etc.).

I think you are misinformed about the performance of mutexes and condition vars. Both are quite fast, and do not, by default, involve polling. Even semaphores have very limited performance implications. Here is what I consider to be a good description of the use of condition variables, and why mutexes are also necessary.
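
(A minimal sketch of the standard pattern -- illustrative names, not taken from the linked description. The mutex protects the shared predicate, and pthread_cond_wait atomically releases it while sleeping, which is what closes the lost-wakeup window.)

    #include <pthread.h>

    static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
    static int             ready = 0;   /* the predicate the mutex protects */

    /* Waiter: sleeps in the kernel (no polling) until the predicate holds. */
    void *waiter(void *arg)
    {
        pthread_mutex_lock(&lock);
        while (!ready)                       /* guards against spurious wakeups */
            pthread_cond_wait(&cond, &lock); /* atomically unlocks, sleeps, relocks */
        pthread_mutex_unlock(&lock);
        return arg;
    }

    /* Signaller: changes the predicate under the same mutex, then signals. */
    void *signaller(void *arg)
    {
        pthread_mutex_lock(&lock);
        ready = 1;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
        return arg;
    }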

I work on a tcp-proxy system where performance is at a premium (on Linux), and the mutex/cond-wait mechanism is never even a blip when we profile it looking for bottlenecks.

fnord


Re^2: OT: Locking and syncing mechanisms available on *nix.
by BrowserUk (Patriarch) on Mar 27, 2011 at 20:25 UTC
    and why mutexes are also necessary.

    Sorry, but that example is contrived to need a mutex. The signalling condition uses a global variable: "Lock associated mutex and check value of a global variable."

    Now consider the case of a signalling condition that doesn't involve a global variable. Then what use is the mutex?

    As for the performance characteristics of mutexes: if I am misinformed, then so is half of the internet, it seems.

    With regard to your TCP/IP example: it is understandable that mutexes are not a bottleneck where IO is involved. Even the 300 to 400 cycles that it takes to make a ring3 - ring0 - ring3 transition pale when compared to IO waits.

    But user-space-only atomic instructions, even with barrier instructions, are much faster for memory-to-memory operations.
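
    (For concreteness, a sketch of such a memory-to-memory operation. This uses C11 atomics, which post-date this thread but are the portable face of the GCC __sync builtins of the day; names are illustrative. The compare-and-swap loop compiles down to LOCK CMPXCHG on x86 and never enters the kernel.)

        #include <stdatomic.h>

        /* Lock-free add: a retried CAS, entirely in user space. */
        long lockfree_add(_Atomic long *counter, long delta)
        {
            long old = atomic_load_explicit(counter, memory_order_relaxed);
            while (!atomic_compare_exchange_weak_explicit(
                       counter, &old, old + delta,
                       memory_order_acq_rel,    /* barrier only on success */
                       memory_order_relaxed))   /* 'old' refreshed on failure */
                ;                               /* another thread raced us; retry */
            return old + delta;
        }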


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Well, I can't speak for Windows (which, I gather from your posts over time, you are an expert in), but for Linux I believe you will find that the scheduler manipulation is more overhead than whichever signalling mechanism you choose. If you really only want a thread to run to a certain point and then wait until it is told to continue, you can use suspend/continue. The first thread calls pthread_suspend on itself, which takes it off the run queue (so it no longer consumes any CPU). Then the second thread uses pthread_continue to 'signal' it to continue. Your only overhead will be the manipulation of the scheduler's queues, which has to happen regardless of the mechanism you choose.
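
      (Note: pthread_suspend/pthread_continue as named here are not POSIX; Solaris provides thr_suspend/thr_continue, and some systems offer pthread_suspend_np/pthread_resume_np. As a portable sketch of the same park-until-continued idea, with illustrative names, one can build it from sigwait/pthread_kill:)

          #include <pthread.h>
          #include <signal.h>

          static sigset_t parkset;

          /* Call once, before creating threads: keep SIGUSR1 blocked in every
           * thread so it is only ever consumed by sigwait() below. */
          void park_init(void)
          {
              sigemptyset(&parkset);
              sigaddset(&parkset, SIGUSR1);
              pthread_sigmask(SIG_BLOCK, &parkset, NULL);
          }

          /* A thread parks itself: off the run queue, zero CPU, until woken. */
          void park_self(void)
          {
              int sig;
              sigwait(&parkset, &sig);
          }

          /* Another thread 'continues' it. A blocked signal stays pending, so
           * an unpark arriving just before the park is not lost. */
          void unpark(pthread_t t)
          {
              pthread_kill(t, SIGUSR1);
          }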

      fnord

        Usage scenario. Most of the time, the producer will be running on one core and the consumer on another, and they will be producing and consuming from their respective ends of the shared memory structure as fast as they can go. No locking; no synching; no (elective) context switching.

        Occasionally, one end or the other will get preempted by some higher-priority thread. At this point, the shared data structure will become either full or empty, depending upon which end is still running. That end then needs to enter a wait state until the other end gets another timeslice, does its thing, relieves the empty or full state, and wakes up the waiting end to continue.

        Most of the time, given a correctly sized and well-written buffering data structure, the above scenario is both lock-free and wait-free, and requires no system calls (ring3/ring0/ring3 transitions). Both consumer and producer threads are free to run as fast as their processing requirements allow, and to utilise their full timeslices. The latter point is the key to maximum utilisation.
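
        (A sketch of that kind of structure: a single-producer/single-consumer ring over C11 atomics -- illustrative names, not the poster's actual code. On the fast path each side writes only its own index and takes one acquire load of the other's; no lock, no syscall.)

            #include <stdatomic.h>
            #include <stdbool.h>

            #define RING_SIZE 1024          /* power of two, so masking works */

            typedef struct {
                void           *slot[RING_SIZE];
                _Atomic size_t  head;       /* advanced only by the producer */
                _Atomic size_t  tail;       /* advanced only by the consumer */
            } ring_t;

            /* Producer: false means full -- the rare case where a wait is needed. */
            static bool ring_push(ring_t *r, void *item)
            {
                size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
                size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
                if (head - tail == RING_SIZE)
                    return false;
                r->slot[head & (RING_SIZE - 1)] = item;
                atomic_store_explicit(&r->head, head + 1, memory_order_release);
                return true;
            }

            /* Consumer: false means empty -- likewise the rare wait case. */
            static bool ring_pop(ring_t *r, void **item)
            {
                size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
                size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
                if (head == tail)
                    return false;
                *item = r->slot[tail & (RING_SIZE - 1)];
                atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
                return true;
            }

        Only the false returns correspond to the preempted full/empty case above; only there need any kernel-assisted wait come into play.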

        If I use suspend/resume, buffer empty/full conditions are guaranteed not only to require multiple calls into the kernel, but also (at least one) very expensive context switch. If I use cond_vars and (unneeded) kernel mutexes, that also means an expensive call into the kernel for every read & write.

        The whole point of lock-free & wait-free algorithms is that they avoid both expensive calls into the kernel and expensive elective context switches--i.e. non-pre-emptive ceding of the cpu--in order to make full use of each time-slice allotted.

        The point of Fast user-space mutexes (futexes) is that their uncontended path runs entirely in user-space, and is therefore faster.
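
        (A sketch of that fast path on Linux -- glibc has no futex() wrapper, so the raw syscall appears here. Illustrative only: a production lock also tracks contention so the uncontended unlock can skip the kernel too; see Drepper's "Futexes Are Tricky".)

            #include <stdatomic.h>
            #include <linux/futex.h>
            #include <sys/syscall.h>
            #include <unistd.h>

            /* 0 = unlocked, 1 = locked. */
            void futex_lock(atomic_int *f)
            {
                int expected = 0;
                /* Fast path: a single user-space CAS when uncontended. */
                while (!atomic_compare_exchange_strong(f, &expected, 1)) {
                    /* Slow path: sleep in the kernel only while *f is still 1. */
                    syscall(SYS_futex, f, FUTEX_WAIT, 1, NULL, NULL, 0);
                    expected = 0;
                }
            }

            void futex_unlock(atomic_int *f)
            {
                atomic_store(f, 0);
                syscall(SYS_futex, f, FUTEX_WAKE, 1, NULL, NULL, 0); /* wake one waiter */
            }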

        The (lock-free/wait-free) algorithms are getting better and better defined. The hardware support (CAS, XCHG and similar SMP atomic instructions) is getting better and better with every new generation of processors.
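
        (For example, the push half of a Treiber-style lock-free stack, built on nothing but that CAS -- a sketch with illustrative names; pop needs ABA protection and is omitted.)

            #include <stdatomic.h>
            #include <stddef.h>

            typedef struct node {
                struct node *next;
                int          value;
            } node_t;

            static _Atomic(node_t *) top = NULL;

            /* Never blocks, never enters the kernel: if another CPU's push wins
             * the race the CAS fails, 'old' is refreshed, and we retry. */
            void stack_push(node_t *n)
            {
                node_t *old = atomic_load_explicit(&top, memory_order_relaxed);
                do {
                    n->next = old;
                } while (!atomic_compare_exchange_weak_explicit(
                             &top, &old, n,
                             memory_order_release, memory_order_relaxed));
            }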

        The limitation currently is locking, syncing and signalling mechanisms that were designed for single-processor/core IPC purposes. Given that much of the HPC research is done on *nix boxes of one flavour or another, I know there are better mechanisms out there. This thread was meant to be about enlisting help to find them, not arguing about whether they are possible, or even required.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
