
Re: slurped scalar map

by dragonchild (Archbishop)
on Jun 20, 2006 at 14:29 UTC ( #556402=note )

in reply to slurped scalar map

Optimize for correctness, first. Parse that into a hash and get it working. Then, if it's not fast enough (and I highly doubt that will be a problem), then come back and ask a question with a working implementation.

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Replies are listed 'Best First'.
Re^2: slurped scalar map
by 0xbeef (Hermit) on Jun 20, 2006 at 15:41 UTC
    I am already past the "working" phase and in the "optimisation" phase. I'm curious about efficiency in terms of best programming practice. The program (too large to post) creates a file consisting of N records, followed by an index at the end containing key info such as fpos markers. (The records consist of the stdout/stderr of several o/s commands and files => 30-50Mb per server for almost 100 servers.)

    The program currently reads the index first, then reads and processes each record as it requires it while working through the data file. I'm trying to find a faster solution, i.e. performing larger sequential reads upfront. Of course, that may have extra considerations, such as a maximum slurp size.
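    For what it's worth, the per-record approach described here might look something like the sketch below. The index format, record lengths, file name and the read_record helper are all assumptions for illustration; the thread doesn't show the poster's actual code.

```perl
use strict;
use warnings;

# Read one record from an already-open filehandle, given its byte
# offset (fpos marker) and length, as recorded in the index.
sub read_record {
    my ($fh, $fpos, $len) = @_;
    seek $fh, $fpos, 0 or die "seek failed: $!";
    my $n = read $fh, my ($record), $len;
    die "short read at offset $fpos" unless defined $n && $n == $len;
    return $record;
}

open my $fh, '<', 'data.bin' or die "open: $!";   # assumed file name
binmode $fh;

# Hypothetical index: [ fpos, length ] pairs, normally parsed from the
# tail of the data file.
my @index = ( [ 0, 1024 ], [ 1024, 2048 ] );

for my $entry (@index) {
    my $record = read_record( $fh, @$entry );
    # ... process $record ...
}
```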

    This exercise will be worth it (in my mind at least) if I can understand the margin by which

    <sequential slurp><process><process><process>

    operations are faster than

    <slurp 1 record><process><slurp next record><process> ...

    Hope this makes sense.
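    One way to actually measure the margin is to benchmark both access patterns against the same file, e.g. with the core Benchmark module. This is only a sketch; the file name and fixed record size are made-up assumptions (the real records are variable-length and located via the index):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $file    = 'data.bin';    # assumed test file
my $rec_len = 64 * 1024;     # assumed record size

cmpthese( -3, {
    # slurp the whole file once, then carve records out of the scalar
    slurp_then_process => sub {
        open my $fh, '<', $file or die "open: $!";
        binmode $fh;
        local $/;                      # slurp mode
        my $data = <$fh>;
        for ( my $off = 0; $off < length $data; $off += $rec_len ) {
            my $record = substr $data, $off, $rec_len;
            # ... process $record ...
        }
    },
    # read and process one record at a time
    read_per_record => sub {
        open my $fh, '<', $file or die "open: $!";
        binmode $fh;
        while ( read $fh, my ($record), $rec_len ) {
            # ... process $record ...
        }
    },
} );
```

    Running this against a file of realistic size should show how much (or how little) the sequential slurp buys over per-record reads on a given system.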


      The OS already does that for you. When you read from a file, you're not actually reading from the disk itself; you read from a buffer that the disk manager creates for you. So slurp-process-slurp-process is going to be nearly as fast as (or faster than) slurp-process-process-process.

        In principle I agree, except on one important detail.

        How can the OS buffer multiple small seek/read operations as effectively as a single seek to start-of-file followed by one large sequential read of the entire content?

        On Unix systems the readahead window is grown dynamically, with logic something like: "ok, you're still reading sequentially - let me double the readahead buffer for the next read".

