The semicolon is an interesting case. I'm leaning towards not counting it, as I can't think of a case where the shorter of two solutions would turn out to be the longer one depending on whether semicolons are counted. If such a case exists, this would take further thought.
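For anyone who wants to test that hunch against actual entries, here is a rough sketch. It assumes solutions are scored by plain character count, which may not be the metric that ends up being used. The helper names `score` and `ranking_flips` and the two sample entries are made up purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: length of a solution with and without semicolons.
sub score {
    my ($code) = @_;
    my $with    = length $code;
    my $without = $with - ($code =~ tr/;//);   # tr/;// only counts, it does not modify
    return ($with, $without);
}

# Hypothetical helper: true if the relative order of two solutions
# differs between the two counting rules (including a tie under one rule only).
sub ranking_flips {
    my ($x, $y) = @_;
    my ($xw, $xo) = score($x);
    my ($yw, $yo) = score($y);
    return (($xw <=> $yw) != ($xo <=> $yo)) ? 1 : 0;
}

# Two made-up candidate solutions, for illustration only.
my %entry = (
    a => 'print for sort @ARGV',
    b => '@s=sort @ARGV;print for @s;',
);

for my $name (sort keys %entry) {
    my ($with, $without) = score($entry{$name});
    print "$name: $with with semicolons, $without without\n";
}
print "ranking changes: ", (ranking_flips(@entry{qw(a b)}) ? "yes" : "no"), "\n";
```

Running real submissions through something like `ranking_flips` pairwise would show whether the semicolon rule ever matters in practice.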
As for automated counting, my idle musing was to abuse the Perltidy code for that purpose.
I don't like the Benchmark.pm idea: you'd have to specify the exact environment (Perl version and patch level, OS, hardware, module versions) to be able to pick a winning entry reproducibly and unambiguously. It's just too fickle a goal.
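To illustrate how much would have to be pinned down, here is a minimal sketch that prints part of the environment next to a `cmpthese` run. The `environment` helper and the two entries are hypothetical, and the hardware is not captured at all, which is part of the problem.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Config;

# Hypothetical helper: collect some of the details a timing depends on.
sub environment {
    return {
        perl      => sprintf('%vd', $^V),     # e.g. major.minor.patch
        os        => $^O,
        arch      => $Config{archname},
        benchmark => Benchmark->VERSION,
    };
}

my $env = environment();
print "$_: $env->{$_}\n" for sort keys %$env;

# Two placeholder entries; the relative timings can differ between machines and builds.
my @data = map { int rand 1000 } 1 .. 1000;
cmpthese(2000, {
    entry_a => sub { my @s = sort { $a <=> $b } @data },
    entry_b => sub { my @s = reverse sort { $b <=> $a } @data },
});
```

Two people running this on different boxes, or on the same box under different load, need not see the same ranking.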
Makeshifts last the longest.