PerlMonks  

Berkeley DB has been mentioned a few times here, and I think it is probably the best idea in your situation (though if you are sorting that much data, I would strongly encourage migrating to a good sturdy DBMS). For just a couple hundred records you will probably not notice much difference, but if this sucker is getting big... Well...

Of course, you should use a hash backed by the database, and that will sort things out nicely.
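As a rough illustration of the "database hash" idea, here is a minimal sketch in Python, using the standard `dbm` module as a stand-in for Berkeley DB (in Perl the usual route would be a hash tied through `DB_File`). The file name, keys and values are made up for the example.

```python
import dbm
import os
import tempfile

def open_store(path):
    """Open (or create) an on-disk key/value store that behaves like a dict."""
    return dbm.open(path, "c")

# Hypothetical data: user numbers mapped to names.
path = os.path.join(tempfile.mkdtemp(), "users")
with open_store(path) as db:
    db[b"1002"] = b"carol"
    db[b"1000"] = b"alice"
    db[b"1001"] = b"bob"

with open_store(path) as db:
    name = db[b"1001"]           # direct lookup by key, no scan of the file
    ordered = sorted(db.keys())  # dbm promises no key order, so sort explicitly
```

The point is only that lookups go by key instead of by scanning; whether keys come back already sorted depends on the backend, which is why the sketch sorts them itself.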

Another thought is that you can breathe a LOT of life into flat files with a few simple methods.
One simple one is to use your filesystem to provide some of the services of a DBMS (e.g., directories named after user numbers), but I don't really recommend that.
Another idea is to put some forethought into your file format. You can put index tables at the front of the file to keep the data sorted, run simple hashes over the files to speed up searching, use tree implementations and such. An important fact is that you don't actually need to visit each record in order to search a file. If the records are kept in order, you can search positionally (kind of like the number game where the computer says "higher" or "lower"). This speeds up searching exponentially (literally): a binary search or a balanced binary tree needs about log2(n) visits instead of the n visits of a linear scan, so a million records take about 20 visits (a B-Tree, with its wider nodes, needs fewer still). Through the judicious use of file pointers, this can be arranged any which way you like. The simple fact of the matter, though, is that implementing such a system is rather mind-wracking (there is probably a Perl module that does something to this effect).
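The "higher or lower" search can be sketched roughly as follows, in Python (Perl's `seek` and `read` would do the same job). This assumes fixed-width records sorted by a numeric id; the record layout and the function names are hypothetical, not something from the original thread.

```python
import os
import struct
import tempfile

# Assumed layout: 4-byte big-endian unsigned id + 28-byte NUL-padded text.
RECORD = struct.Struct(">I28s")

def write_sorted(path, rows):
    """Write (id, text) rows as fixed-width records, ordered by id."""
    with open(path, "wb") as f:
        for rec_id, text in sorted(rows):
            f.write(RECORD.pack(rec_id, text.encode("utf-8")))

def find(path, wanted):
    """Binary-search the file by id; return (text or None, records visited)."""
    visits = 0
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // RECORD.size
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD.size)  # jump straight to record number mid
            rec_id, payload = RECORD.unpack(f.read(RECORD.size))
            visits += 1
            if rec_id == wanted:
                return payload.rstrip(b"\0").decode("utf-8"), visits
            if rec_id < wanted:
                lo = mid + 1           # "higher"
            else:
                hi = mid               # "lower"
    return None, visits

if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "users.dat")
    write_sorted(demo, [(i, f"user{i}") for i in range(0, 2000, 2)])
    print(find(demo, 1234))
```

Fixed-width records are what make the `seek` arithmetic trivial. With variable-length lines you would need an offset table at the front of the file, which is the "tables in the front" idea.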

I personally would try splitting your file on some criterion into smaller files, each of which can be searched flatly, and hand it off to IT for a proper DBMS if access time gets to be a problem.
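One way the splitting could look, as a minimal Python sketch: the criterion here is a CRC32 hash of the key, which is only one possible choice (first letter or id range would work too). The bucket count, file naming and tab-separated format are all assumptions for illustration, and keys and values are assumed to contain no tabs or newlines.

```python
import os
import tempfile
import zlib

N_BUCKETS = 16  # assumed; choose it so each file stays small

def bucket_path(directory, key):
    """Map a key to one of N_BUCKETS files using a stable hash (CRC32).

    Python's built-in hash() is avoided because it varies between runs.
    """
    n = zlib.crc32(key.encode("utf-8")) % N_BUCKETS
    return os.path.join(directory, f"bucket_{n:02d}.txt")

def add_record(directory, key, value):
    """Append 'key<TAB>value' to the bucket file that owns this key."""
    with open(bucket_path(directory, key), "a", encoding="utf-8") as f:
        f.write(f"{key}\t{value}\n")

def lookup(directory, key):
    """Flat scan of the single bucket that could hold the key."""
    path = bucket_path(directory, key)
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        for line in f:
            k, _, v = line.rstrip("\n").partition("\t")
            if k == key:
                return v
    return None
```

Each lookup still scans a file from the top, but only about one sixteenth of the data, and the per-file code stays as simple as the original flat-file version.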

Just Another Perl Backpacker

In reply to Re: When is a flat file DB not enough? by Nitsuj
in thread When is a flat file DB not enough? by TrinityInfinity
