You may find one or both of the following helpful:
Using DBM::Deep and Parallel::ForkManager for a generalized parallel hashmap function builder (followup to "reducing pain of parallelization with FP")
Using functional programming to reduce the pain of parallel-execution programming (with threads, forks, or name your poison)
I admit, however, that after playing around with different ways to get WWW::Mechanize to behave in a parallel way, I eventually gave up, shrugged, and kept doing things serially. (Parallelism in my case turned out to be a bit of a premature optimization; other features were more important.)