Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Hey mothra, I guess I know why you haven't been to the local Perl monger meetings in a while ... :-)

Well...not quite. :) I've been busy, planning on moving to Europe, trying to sell my car, etc. In January, I was in London and Amsterdam, and got together with a couple of the London.pm'ers.

My motivations are much more to do with finding the tool that lets me be as lazy as possible.

Now, quickly, on to the code (I have to actually do some work right away, heh).

First off, I was hoping to say that Python's fileinput module (its input() method specifically) was equivalent to Perl's <>, however it isn't. I've sent a message to comp.lang.python to try and understand why they work differently, because I understood them to be the same, until I tried to map it onto the wc program.

Anyways, to the code.

First off, I ran your programs on my machine (Pentium 733, 256 MB RAM, cygwin, NT4WS, Python 2.2, Perl 5.6.1), and large.txt was an 11 M file.

$ time ./wc.py wc.pl wc.py large.txt 21 58 494 wc.pl 25 96 698 wc.py 382230 1290003 11930691 large.txt 382276 1290157 11931883 total real 0m7.922s user 0m7.218s sys 0m0.124s $ time ./wc.pl wc.pl wc.py large.txt 21 58 494 wc.pl 25 96 698 wc.py 382230 1290003 11930691 large.txt 382276 1290157 11931883 total real 0m4.484s user 0m4.186s sys 0m0.093s

Then, I made some changes to the Python:

#!/usr/bin/python import sys files = map(lambda f: open(f), sys.argv[1:]) or [sys.stdin] Twords = Tlines = Tchars = 0 for file in files: words = lines = chars = 0 for line in file.xreadlines(): lines += 1 words += len(line.split()) chars += len(line) print "%7d %7d %7d %s" % (lines, words, chars, file.name) Twords += words Tlines += lines Tchars += chars if len(sys.argv) > 2: print "%7d %7d %7d total" % (Tlines, Twords, Tchars)

With the following results:

$ time ./wc.py wc.pl wc.py large.txt 21 58 494 wc.pl 17 74 518 wc.py 382230 1290003 11930691 large.txt 382268 1290135 11931703 total real 0m6.157s user 0m6.046s sys 0m0.124s
It seems you were using a fairly old version of Python. Version 2.1 sped up line-by-line file access.

So, for what point? I'm not sure, but you said you were interested in seeing a better (though I'd definitely not dare claim "best") version of the Python code, so there's my contribution. :) Also, it's worth noting that the speed differences in the example are neglible.

Update I: words = lines = chars = 0 might be slightly more idiomatic. I also would have written the map code (in the Python version) all on one line. That's a style difference, I guess. :)

Update II: Okay, I put the changes in the Python code mentioned in Update I.

Update III: And, for those who claim Python "forces" you into its own coding style, note that I could have written the map code using a list comprehension instead:

files = [open(f) for f in sys.argv[1:]] or [sys.stdin]

Python gives you more than one way to do it. IMHO it "takes away your options" in places where too many options are a Bad Thing anyway. (e.g. one way to define func parameters instead of using shift or @_ in Perl), totally eliminating any concerns about differences in {} style, because they're gone, etc.)


In reply to Re: Re: Perl vs. Python: Looking at the Code by mothra
in thread Perl vs. Python: Looking at the Code by mothra

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (9)
As of 2024-04-19 09:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found