Wow, great summary! One comment. You say:
Using mmap() to do IPC (inter-process communication) is a rotten idea. It's impossible to check for a lock and then lock it if it's available in a single operation except using special instructions in the CPU, so without writing XS, you can't do locking operations on data in mmap'd areas. This means that any program that attempts to use mmap'd areas for IPC is going to have race conditions that cause that program to lock up or lose data sooner or later.
For file-backed mmap, it seems like fcntl byte-range locking (F_SETLK / F_SETLKW) would do the trick, although of course it requires a syscall and so would take longer than a CPU instruction. Is there some reason I haven't thought of why this won't work, or why it is otherwise a horrible idea?
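
To make the question concrete, here is a minimal sketch of the idea, written in Python only because its fcntl module wraps the same POSIX call without hand-packing a struct flock. The slot layout (one 8-byte counter per slot) and the names SLOT, increment_slot and demo are hypothetical, made up for illustration. It assumes a Unix system with fork() and a local filesystem where POSIX record locks work.

```python
import fcntl
import mmap
import os
import struct
import tempfile

# Hypothetical layout: the file is an array of 8-byte unsigned counters.
SLOT = struct.Struct("<Q")


def increment_slot(fd, mm, slot):
    """Read-modify-write one counter while holding a lock on its byte range."""
    offset = slot * SLOT.size
    # Block until this process holds an exclusive lock on just this slot.
    # fcntl.lockf() is Python's wrapper around fcntl(2) record locking.
    fcntl.lockf(fd, fcntl.LOCK_EX, SLOT.size, offset, os.SEEK_SET)
    try:
        (value,) = SLOT.unpack_from(mm, offset)
        SLOT.pack_into(mm, offset, value + 1)
        return value + 1
    finally:
        # Release only the range we locked.
        fcntl.lockf(fd, fcntl.LOCK_UN, SLOT.size, offset, os.SEEK_SET)


def demo(workers=4, rounds=200):
    """Fork several workers that all bump slot 0; return the final count."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, SLOT.size * 2)
        # On Unix, mmap.mmap() defaults to a shared, read/write mapping.
        mm = mmap.mmap(fd, SLOT.size * 2)
        pids = []
        for _ in range(workers):
            pid = os.fork()
            if pid == 0:
                # Child: the lock is owned per process, so children contend.
                for _ in range(rounds):
                    increment_slot(fd, mm, 0)
                os._exit(0)
            pids.append(pid)
        for pid in pids:
            os.waitpid(pid, 0)
        (total,) = SLOT.unpack_from(mm, 0)
        mm.close()
        return total
    finally:
        os.close(fd)
        os.unlink(path)
```

If the locking does its job, demo() should return workers * rounds, with no lost updates. Two caveats apply to this sketch. The locks are advisory, so they only protect against processes that also take them. And POSIX record locks are owned by the process, so they do nothing to separate threads within one process, and closing any descriptor for the file drops all of that process's locks on it.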