For this sort of computation, C or C++ is simply far faster than Perl. Perl can't even beat Java here. The intensive loop in your code reminds me of the "Sieve of Eratosthenes" benchmarks (see
the Great Win32 Computer Language Shootout or
the original shootout), in which Perl is far behind Java.
I have tried a bit-vector version of it, but still got bad results, so I'm sure the slowness is in the arithmetic. Monks here have played golf on Eratosthenes, but I don't know of any attempts at performance hacks.
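
For anyone who wants to try it, here is a minimal sketch of what a bit-vector sieve can look like in Perl. It is not the exact code I benchmarked; the sub name `sieve_primes` is made up for illustration, and it uses one bit per integer via the built-in `vec`. Note that the inner loop still does one addition, one comparison and one `vec` store per step, so the bit vector mainly saves memory, not interpreter work.

```perl
use strict;
use warnings;

# Hypothetical helper: returns all primes <= $n, using a bit string
# in which bit $i is set once $i is known to be composite.
sub sieve_primes {
    my ($n) = @_;
    return () if $n < 2;

    my $composite = '';
    vec($composite, $n, 1) = 0;    # extend the bit string to cover 0..$n

    for (my $i = 2; $i * $i <= $n; $i++) {
        next if vec($composite, $i, 1);    # $i was already crossed out
        # Cross out multiples of $i, starting at $i squared.
        for (my $j = $i * $i; $j <= $n; $j += $i) {
            vec($composite, $j, 1) = 1;
        }
    }

    return grep { !vec($composite, $_, 1) } 2 .. $n;
}

my @primes = sieve_primes(100);
print scalar(@primes), " primes up to 100\n";    # prints "25 primes up to 100"
```

Wrapping a large call in the core `Benchmark` module or plain `time` should show how it compares with the array version on your machine.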