PerlMonks  

Re: How do I measure my bottle ?

by RazorbladeBidet (Friar)
on Mar 25, 2005 at 13:33 UTC ( [id://442329]=note )


in reply to How do I measure my bottle ?

Most recent versions of Windows have a performance monitor under Administrative Tools (not sure what Win 2003 has); that would probably be your simplest bet. You can view the counters on a per-process basis. I'm sure there's a Perl module to access this data if you would rather read it from a script.

However, I'm a little confused. Are you slurping a file into a scalar and using a 500MB hash key? Or does the first line of that file take 15 seconds to read? Also, does the read take 15 seconds and the hash insert 20 seconds (35 seconds total), or is the total 20 seconds?

One option would be to have a multi-threaded program where one thread reads the file and the other works on the hash.
--------------
"But what of all those sweet words you spoke in private?"
"Oh that's just what we call pillow talk, baby, that's all."

Replies are listed 'Best First'.
Re^2: How do I measure my bottle ?
by cbrain (Novice) on Mar 25, 2005 at 13:43 UTC
    Thanks for your rapid reply. My test scripts are as follows:

    script 1:

    while(my $l=<IN>){ }

    script 2:

    while(my $l=<IN>){ my $id=substr($l,0, 33); $hash{$id}=1; }

    Script 1 takes around 15 secs, whereas script 2 takes around 20 secs.

      Then your hash insert is only taking 5 seconds (all other things being equal).

      There is the memory consideration, also (as stated below).

      Is this 20M records totalling 1GB or 1TB? (You mention 1,000 GB in your original post.)

      Is there a reason you are using a hash? (In your example it looks like you could use an array, but I understand it is merely a "test".)

      If you have many files (and it sounds like you do), you could slurp in each file whole, one file at a time, and then do the inserts from memory. That will increase your memory usage but should decrease CPU time. See File::Slurp.
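A minimal sketch of the slurp-then-insert idea, using only core Perl (undefining `$/`) rather than File::Slurp so it runs without extra modules. The sub name `ids_from_slurped_file` is made up for illustration, and the 33-character ID width is taken from the test script above. Note that the whole file plus the list of lines sits in memory at once, so this only makes sense for files that comfortably fit in RAM.

```perl
use strict;
use warnings;

# Read one file completely into memory, then insert the leading
# 33 characters of every line into a hash.
# Hypothetical helper: the name and the return value are illustrative.
sub ids_from_slurped_file {
    my ($path) = @_;

    open my $in, '<', $path or die "Cannot open $path: $!";
    my $content = do { local $/; <$in> };   # $/ undefined: read to end of file
    close $in;

    my %seen;
    # split /^/m breaks the buffer into lines and keeps each newline,
    # so substr sees the same text the line-by-line loop would see.
    for my $line (split /^/m, $content) {
        $seen{ substr($line, 0, 33) } = 1;
    }
    return \%seen;
}
```

Calling `ids_from_slurped_file($file)` once per file returns a hash reference of the IDs seen in that file. Whether this actually beats the line-by-line loop on your data is something to measure, not assume.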

      If you want a handle on the cost of hash inserts, you may as well make the comparison more precise by making script 1 do something like:

      while(my $l=<IN>){ my $id=substr($l,0,33); $hash=1; }
      (Assuming of course, the compiler doesn't optimize any of this away.)
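To make that comparison concrete, here is a rough sketch that times all three variants on the same file with Time::HiRes (a core module). The sub names `time_pass` and `compare_passes` are made up for this example. Each variant pays the same per-line sub call, so the differences between the timings are more meaningful than the absolute numbers. The first pass may also be slower simply because the file is not yet in the OS cache, so it is worth running the comparison more than once.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Run $per_line on every line of $path; return elapsed wall-clock seconds.
sub time_pass {
    my ($path, $per_line) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    my $start = time();
    while (my $l = <$in>) {
        $per_line->($l);
    }
    my $elapsed = time() - $start;
    close $in;
    return $elapsed;
}

# Time the three variants from this thread against the same file.
sub compare_passes {
    my ($path) = @_;
    my (%hash, $scalar);
    my %t = (
        # script 1: read only
        read_only   => time_pass($path, sub { }),
        # control: substr plus a plain scalar assignment
        substr_only => time_pass($path, sub {
            my $id = substr($_[0], 0, 33); $scalar = 1;
        }),
        # script 2: substr plus the hash insert
        hash_insert => time_pass($path, sub {
            my $id = substr($_[0], 0, 33); $hash{$id} = 1;
        }),
    );
    $t{distinct_ids} = keys %hash;   # number of distinct IDs inserted
    return \%t;
}
```

The difference between `hash_insert` and `substr_only` is then an estimate of what the hash inserts alone cost, which is the number the original question is after.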

      the lowliest monk
