You could use SysV IPC for this. It provides semaphores for locking and persistent shared memory for sharing data. I use it in one project and it is very fast, though I don't recall the benchmark numbers. See perlipc, shmget, and IPC::SysV.
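Here is a rough sketch of the SysV approach, assuming a single 32-bit counter shared between forked workers. The worker count, iteration count, and the increment helper are made up for illustration; only the built-in shmget/semget family and the IPC::SysV constants are used.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID SEM_UNDO SETVAL);

my ($WORKERS, $ITERATIONS) = (4, 25);    # illustrative values, not from the post

# One 4-byte segment holds the counter; one semaphore acts as a mutex.
my $shm = shmget(IPC_PRIVATE, 4, IPC_CREAT | 0600);
defined $shm or die "shmget: $!";
my $sem = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
defined $sem or die "semget: $!";

semctl($sem, 0, SETVAL, 1) or die "semctl: $!";              # 1 means unlocked
shmwrite($shm, pack("L", 0), 0, 4) or die "shmwrite: $!";    # counter starts at 0

# Hypothetical helper: lock, read-modify-write, unlock.
sub increment {
    semop($sem, pack("s!3", 0, -1, SEM_UNDO)) or die "lock: $!";
    my $buf = '';
    shmread($shm, $buf, 0, 4) or die "shmread: $!";
    shmwrite($shm, pack("L", unpack("L", $buf) + 1), 0, 4) or die "shmwrite: $!";
    semop($sem, pack("s!3", 0, 1, SEM_UNDO)) or die "unlock: $!";
}

for (1 .. $WORKERS) {
    my $pid = fork();
    defined $pid or die "fork: $!";
    next if $pid;                         # parent keeps forking
    increment() for 1 .. $ITERATIONS;     # child does its share, then exits
    exit 0;
}
1 while wait() != -1;                     # reap all children

my $buf = '';
shmread($shm, $buf, 0, 4) or die "shmread: $!";
my $total = unpack("L", $buf);

# SysV objects outlive the process, so remove them explicitly.
shmctl($shm, IPC_RMID, 0);
semctl($sem, 0, IPC_RMID, 0);
print "final count: $total\n";
```

SEM_UNDO asks the kernel to release the lock if a worker dies while holding it, which is worth having when the segment persists beyond any one process.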
You could also use Sys::Mmap, but you would have to find something else for locking. lockf should work fine; I would be surprised if it were very slow, provided it is done carefully.
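To illustrate just the locking half of that idea, here is a minimal sketch. Two substitutions to be aware of: it uses Perl's built-in flock rather than lockf, and it uses plain file reads and writes in place of a Sys::Mmap region so that it runs with core modules only. The temp file and the locked_increment helper are hypothetical; the same lock/unlock pair would wrap accesses to a mapped variable.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempfile);

my ($WORKERS, $ITERATIONS) = (4, 25);    # illustrative values

# Hypothetical counter file, initialised to a packed 32-bit zero.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} pack("L", 0);
close $tmp or die "close: $!";

# Hypothetical helper: take an exclusive lock, bump the counter, release.
sub locked_increment {
    my ($file) = @_;
    open(my $fh, '+<:raw', $file) or die "open: $!";
    flock($fh, LOCK_EX) or die "flock: $!";        # blocks until the lock is ours
    read($fh, my $buf, 4) == 4 or die "short read";
    my $n = unpack("L", $buf) + 1;
    seek($fh, 0, SEEK_SET) or die "seek: $!";
    print {$fh} pack("L", $n) or die "write: $!";
    close $fh or die "close: $!";                  # flushes, then drops the lock
    return $n;
}

for (1 .. $WORKERS) {
    my $pid = fork();
    defined $pid or die "fork: $!";
    next if $pid;
    locked_increment($path) for 1 .. $ITERATIONS;
    exit 0;
}
1 while wait() != -1;

open(my $fh, '<:raw', $path) or die "open: $!";
read($fh, my $final, 4) == 4 or die "short read";
close $fh;
my $total = unpack("L", $final);
print "final count: $total\n";
```

Reopening the file on every increment is the slow part here; with a mapped region you would keep one handle open for locking and touch the memory directly, which is where "done carefully" comes in.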
Finally, you could write just the counter part in C, using mmap to store the counters and atomic operations for consistency. Intel's Threading Building Blocks provides a useful way to do this on Intel hardware, but most modern hardware has something similar. With Inline::C it's not too hard to mix Perl and C.
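A sketch of the C-counter idea, under several assumptions: Inline::C and a GCC- or Clang-compatible compiler are installed, the platform supports MAP_ANONYMOUS, and the compiler's __sync_add_and_fetch builtin stands in for Threading Building Blocks. The function names counter_init, counter_incr, and counter_get are made up for this example.

```perl
use strict;
use warnings;
use Inline C => <<'END_C';
#include <sys/mman.h>

/* The counter lives in an anonymous shared mapping, so forked
   children see the same memory as the parent. */
static volatile long *counter = NULL;

int counter_init() {
    void *p = mmap(NULL, sizeof(long), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return 0;
    counter = (volatile long *)p;
    *counter = 0;
    return 1;
}

/* Compiler builtin: atomic add that returns the new value. */
long counter_incr() { return __sync_add_and_fetch(counter, 1); }

/* Adding zero gives an atomic read. */
long counter_get() { return __sync_add_and_fetch(counter, 0); }
END_C

my ($WORKERS, $ITERATIONS) = (4, 25);    # illustrative values

counter_init() or die "mmap failed";

for (1 .. $WORKERS) {
    my $pid = fork();
    defined $pid or die "fork: $!";
    next if $pid;
    counter_incr() for 1 .. $ITERATIONS;  # no lock needed, the add is atomic
    exit 0;
}
1 while wait() != -1;

my $total = counter_get();
print "final count: $total\n";
```

For counters that must survive the process, the mapping would be backed by a file instead of MAP_ANONYMOUS; the atomic increment stays the same.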