http://qs321.pair.com?node_id=507653


in reply to Remain calm and confident in difficult times

If you believe that I/O performance was the difference between Perl and Java on that test, then you believe that Perl's I/O system is so bad that it takes minutes to run through 100 MB of local data on a decent PC.

No sane person with even a modicum of knowledge about the relative costs of different computing operations would think of that first. They wouldn't have to test it to know that I/O was highly unlikely to be the cause of that much slowdown.

But you seem to think it worth testing, so I did a quick test on my laptop. Just running through a 5 MB file and doing something trivial (counting lines) took 0.266 seconds the first time and 0.072 seconds on subsequent runs. Scaling the slower figure by 20, the worst case just to run through 100 MB on my laptop would be about 5.3 seconds, comfortably under 6, and the vast majority of that worst case is spent with the operating system waiting for disk. (The difference between the first and later runs is that the file gets cached in RAM, so later runs don't go to disk.)

Should you wish to repeat my test on your machine, here is the program:

time perl -le '$l++ while <>; END {print $l}' BIG_FILE
In short, as I predicted, file I/O for that much data is a couple of orders of magnitude too small to explain the performance difference in Tim Bray's test.
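
For anyone who wants the Java side of the same measurement, here is a minimal sketch of an equivalent line counter, plus the linear extrapolation used above (0.266 seconds for 5 MB scaled to 100 MB). It is not part of the original test: the class and method names (LineCountTiming, countLines, extrapolateSeconds) are made up for illustration, and decoding as ISO-8859-1 is just an assumption so that arbitrary bytes never trigger a decoding error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not from the original post.
class LineCountTiming {
    // Stream through the input and count lines, like `$l++ while <>` in Perl.
    static long countLines(Reader source) {
        long lines = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            while (in.readLine() != null) {
                lines++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Linear extrapolation: time measured on sampleMb, scaled to targetMb.
    static double extrapolateSeconds(double measuredSeconds, double sampleMb, double targetMb) {
        return measuredSeconds * (targetMb / sampleMb);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LineCountTiming.java BIG_FILE");
            return;
        }
        Path file = Paths.get(args[0]);
        double sizeMb = Files.size(file) / (1024.0 * 1024.0);

        long start = System.nanoTime();
        // ISO-8859-1 maps every byte to a character, so no input is rejected.
        long lines = countLines(Files.newBufferedReader(file, StandardCharsets.ISO_8859_1));
        double seconds = (System.nanoTime() - start) / 1e9;

        System.out.println(lines + " lines in " + seconds + " s");
        System.out.println("extrapolated to 100 MB: "
                + extrapolateSeconds(seconds, sizeMb, 100.0) + " s");
    }
}
```

On Java 11 or later this can be run directly with `java LineCountTiming.java BIG_FILE`. The timer inside the program excludes JVM startup, so it is not directly comparable to wrapping the whole command in `time`. As with the Perl version, run it more than once to see the effect of the file cache.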

As for your taking me to task for expecting Java not to have optimized this case, try compiling and running the following Java program:

import java.util.regex.*;

public class FooTest {
    public static void main(String[] args) throws Exception {
        Pattern p = Pattern.compile("^(\\s*foo\\s*)*$");
        Matcher m = p.matcher(
            "foo foo foo foo foo foo foo foo foo foo "
            + "foo foo foo foo foo foo foo foo foo foo "
            + "foo foo foo foo foo foo foo foo foo foo "
            + "foo foo foo foo foo foo foo foo foo foo "
            + "foo foo foo foo foo foo foo foo foo foo "
            + "foo foo foo foo foo foo foo foo foo foo "
            + "foo foo foo foo foo foo foo foo foo foo "
            + "fo");
        if (m.find())
            System.out.println("Matched");
        else
            System.out.println("Didn't match");
    }
}
If that program hangs, then your version of Java does not implement the optimization that I said caused a significant slowdown in Perl. By my tests, Java does not implement that optimization, but perhaps you have a version that does. (I would bet money that you don't.)
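
If you would rather not wait on a hang, a gentler way to see the same thing is to time the pattern on much shorter inputs and watch how the time grows. The reason a naive backtracking engine struggles is that each space between two words can be consumed either by the trailing \s* of one repetition or by the leading \s* of the next, so a failing match can try every combination. The sketch below is only an illustration under that assumption: the class and helper names (BacktrackGrowth, almostMatching, matches) are invented here, and the timings will vary by machine and JVM version.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
class BacktrackGrowth {
    // Same pattern as in FooTest.
    static final Pattern P = Pattern.compile("^(\\s*foo\\s*)*$");

    // Build "foo foo ... foo fo": n complete words, then a truncated one,
    // so the overall match must fail.
    static String almostMatching(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("foo ");
        }
        return sb.append("fo").toString();
    }

    static boolean matches(String input) {
        return P.matcher(input).find();
    }

    public static void main(String[] args) {
        // Inputs are kept short so the loop finishes even if the
        // engine backtracks exponentially.
        for (int n = 10; n <= 20; n += 2) {
            String input = almostMatching(n);
            long start = System.nanoTime();
            boolean matched = matches(input);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(n + " words: matched=" + matched + ", " + micros + " us");
        }
    }
}
```

If the reported time multiplies by a roughly constant factor for every couple of added words, the engine is exploring the combinations rather than short-circuiting them, and the 70-word input in FooTest is far out of reach. If the times stay flat, your Java version handles this case.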

So while you disparage the reasoning by which I concluded that Java was very unlikely to have that optimization, I seem to have come to the correct conclusion about whether or not it did. If you wish, you may believe that I just made a lucky guess.

Incidentally, for future reference, if you're planning to criticize articles, it helps to provide links to them so that people can see what you're talking about. The article of mine that you were criticizing was "Benchmarks aren't everything". I suspect that the other article was "Enterprise Perl", but it might have been "What is Enterprise Software?" instead; I can't tell from what you've said.
