http://qs321.pair.com?node_id=156667


in reply to Re: Perl vs. Python: Looking at the Code
in thread Perl vs. Python: Looking at the Code

Hey mothra, I guess I know why you haven't been to the local Perl monger meetings in a while ... :-)

Well...not quite. :) I've been busy, planning on moving to Europe, trying to sell my car, etc. In January, I was in London and Amsterdam, and got together with a couple of the London.pm'ers.

My motivations have much more to do with finding the tool that lets me be as lazy as possible.

Now, quickly, on to the code (I have to actually do some work right away, heh).

First off, I was hoping to say that Python's fileinput module (specifically its input() function) is equivalent to Perl's <>, but it isn't. I understood them to be the same until I tried to map fileinput onto the wc program, so I've sent a message to comp.lang.python to try to understand why they work differently.
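For anyone who wants to poke at the comparison themselves, here is a minimal sketch of per-file counting on top of fileinput. It assumes a current Python 3 rather than the 2.2 used in this post, and count_per_file is a hypothetical helper name, not something from the original wc script. One visible difference from a per-file loop is that a file with no lines never yields anything, so it gets no row at all.

```python
import fileinput


def count_per_file(paths):
    """Map each file name to [lines, words, chars] using one chained stream.

    count_per_file is a hypothetical helper, not part of the original script.
    An empty `paths` list makes fileinput read standard input instead.
    """
    totals = {}
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # filename() names the file the current line came from.
            counts = totals.setdefault(stream.filename(), [0, 0, 0])
            counts[0] += 1
            counts[1] += len(line.split())
            counts[2] += len(line)
    return totals
```

Calling count_per_file(["wc.pl", "wc.py"]) would return one entry per non-empty file, which the caller can then print and sum for the total row.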

Anyways, to the code.

I ran your programs on my machine (Pentium 733, 256 MB RAM, cygwin, NT4WS, Python 2.2, Perl 5.6.1), with large.txt being an 11 MB file.

$ time ./wc.py wc.pl wc.py large.txt
     21      58     494 wc.pl
     25      96     698 wc.py
 382230 1290003 11930691 large.txt
 382276 1290157 11931883 total

real    0m7.922s
user    0m7.218s
sys     0m0.124s

$ time ./wc.pl wc.pl wc.py large.txt
     21      58     494 wc.pl
     25      96     698 wc.py
 382230 1290003 11930691 large.txt
 382276 1290157 11931883 total

real    0m4.484s
user    0m4.186s
sys     0m0.093s

Then, I made some changes to the Python:

#!/usr/bin/python
import sys

files = map(lambda f: open(f), sys.argv[1:]) or [sys.stdin]
Twords = Tlines = Tchars = 0
for file in files:
    words = lines = chars = 0
    for line in file.xreadlines():
        lines += 1
        words += len(line.split())
        chars += len(line)
    print "%7d %7d %7d %s" % (lines, words, chars, file.name)
    Twords += words
    Tlines += lines
    Tchars += chars
if len(sys.argv) > 2:
    print "%7d %7d %7d total" % (Tlines, Twords, Tchars)

With the following results:

$ time ./wc.py wc.pl wc.py large.txt
     21      58     494 wc.pl
     17      74     518 wc.py
 382230 1290003 11930691 large.txt
 382268 1290135 11931703 total

real    0m6.157s
user    0m6.046s
sys     0m0.124s

It seems you were using a fairly old version of Python. Version 2.1 sped up line-by-line file access.
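For readers on a current Python, a rough port of the counting loop might look like the sketch below. It assumes Python 3, where xreadlines() no longer exists and iterating over the file object directly is the lazy line-by-line idiom. The names wc_file and format_row are hypothetical, and because text mode counts characters rather than bytes, the last column can differ from the Perl figure on non-ASCII input.

```python
def wc_file(handle):
    """Return (lines, words, chars) for an open text file."""
    lines = words = chars = 0
    # Iterating the handle reads one line at a time without loading the file.
    for line in handle:
        lines += 1
        words += len(line.split())
        chars += len(line)
    return lines, words, chars


def format_row(counts, label):
    """Format one output row in the same layout as the script above."""
    return "%7d %7d %7d %s" % (*counts, label)
```

A caller would open each file named on the command line, print format_row(wc_file(fh), fh.name), and keep running totals for the final row.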

So, what's the point? I'm not sure, but you said you were interested in seeing a better (though I'd definitely not dare claim "best") version of the Python code, so there's my contribution. :) Also, it's worth noting that the speed differences in this example are negligible: about 6.2 seconds for the Python version versus 4.5 seconds for the Perl version on an 11 MB file.

Update I: words = lines = chars = 0 might be slightly more idiomatic. I also would have written the map code (in the Python version) all on one line. That's a style difference, I guess. :)

Update II: Okay, I put the changes in the Python code mentioned in Update I.

Update III: And, for those who claim Python "forces" you into its own coding style, note that I could have written the map code using a list comprehension instead:

files = [open(f) for f in sys.argv[1:]] or [sys.stdin]

Python gives you more than one way to do it. IMHO it "takes away your options" in places where too many options are a Bad Thing anyway (e.g. one way to define function parameters instead of using shift or @_ in Perl, totally eliminating any concerns about differences in {} style because the braces are gone, etc.).
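A side note for anyone trying the two forms on a current Python: the sketch below assumes Python 3, and both function names are hypothetical. There, the list-comprehension form from Update III still falls back to stdin when no files are named, because an empty list is falsy. The map form used in the script earlier in this post would not, since map() now returns a lazy iterator that is truthy even when it has nothing to yield.

```python
import sys


def open_inputs(paths, opener=open):
    """Open each path, or fall back to standard input when none are given."""
    # An empty list is falsy, so `or` selects the stdin fallback.
    return [opener(p) for p in paths] or [sys.stdin]


def map_is_truthy(paths):
    """Show that a Python 3 map object is truthy even with no items."""
    # map() builds a lazy iterator here and does not call open() at all.
    return bool(map(open, paths))
```

The opener parameter is only there so the fallback can be exercised without touching real files.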