PerlMonks  

TIE:: to a compressed file ?

by ShaneMetler (Initiate)
on Jul 22, 2002 at 17:23 UTC ( [id://184139]=perlquestion )

ShaneMetler has asked for the wisdom of the Perl Monks concerning the following question:

hi everyone,

i'm working on a search engine project and i am using my file system & text files as part of the search engine's database.

i have been using Tie::File to access these DB text files directly (if i understand correctly, using TIE does not pull the whole text file into memory, thus saving time & memory when working with really large files??)

i have also been using Compress::Zlib to compress large chunks of data inside the DB text files ... each text file in our DB is treated as an array of records (one record per line) and looks like this:

numeric id | plain text data | zipped big chunk of data \n
numeric id | plain text data | zipped big chunk of data \n

i'm thinking this is useful because i can TIE to the file, pull any array element, then uncompress the zipped data to use.
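A minimal sketch of the access pattern described above, under some assumptions the post does not state: the field separator is taken to be " | ", and the compressed chunk is base64-encoded before it is written, because raw Compress::Zlib output can contain newline bytes that would split a record across lines. The helper names `make_record` and `read_chunk` and the sample data are hypothetical.

```perl
use strict;
use warnings;
use Tie::File;
use Compress::Zlib qw(compress uncompress);
use MIME::Base64 qw(encode_base64 decode_base64);
use File::Temp qw(tempfile);

# Hypothetical sample file standing in for one DB text file.
my ($tmp_fh, $db_file) = tempfile(UNLINK => 1);
close $tmp_fh;

# Tie the file to an array: one element per line, fetched on demand.
tie my @records, 'Tie::File', $db_file
    or die "Cannot tie $db_file: $!";

# Build "id | plain text | packed chunk". The second argument to
# encode_base64 ('') suppresses line breaks, so the record stays on one line.
sub make_record {
    my ($id, $text, $chunk) = @_;
    return join ' | ', $id, $text, encode_base64(compress($chunk), '');
}

# Split a record and restore the original chunk.
# Assumes the plain text field never contains " | ".
sub read_chunk {
    my ($record) = @_;
    my ($id, $text, $packed) = split / \| /, $record, 3;
    return ($id, $text, uncompress(decode_base64($packed)));
}

push @records, make_record(1, 'first page',  "lorem ipsum\n" x 1000);
push @records, make_record(2, 'second page', "dolor sit amet\n" x 1000);

# Pull one element and uncompress only that record's chunk.
my ($id, $text, $chunk) = read_chunk($records[1]);
print "id=$id text=$text chunk bytes=", length($chunk), "\n";

untie @records;
```

Base64 grows the compressed chunk by about a third, so this trades some disk space for records that are safe to store one per line.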

now my question is two fold ...

1) is this a good strategy to manage large volumes of data or am i missing something?

2) i would also like to compress each DB text file (in addition to the already zipped chunks of data in each file), but i can't figure out how to TIE into the zipped file without pulling the whole text file into memory.

hope this makes sense ... needless to say, i'm a big fat newbie when it comes to Perl.

Shane

Replies are listed 'Best First'.
Re: TIE:: to a compressed file ?
by stefp (Vicar) on Jul 22, 2002 at 18:30 UTC
    Once you have zipped the whole file, you lose the notion of lines. The consequence is that you can't use both Compress::Zlib and Tie::File on the same file at once.

    -- stefp -- check out TeXmacs wiki

(RhetTbull) Re: TIE:: to a compressed file ?
by RhetTbull (Curate) on Jul 22, 2002 at 23:15 UTC
    Take a look at IO::Zlib. I believe it'll do exactly what you want -- that is, tie a compressed file and treat it like an ordinary file handle (a la Tie::File). In your case, I wouldn't keep the large data record compressed AND use a compressed data file -- I don't see that it buys you anything. Just use IO::Zlib and compress the whole file.
    --RT
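A rough sketch of what the IO::Zlib suggestion could look like, using its object interface to write a gzip-compressed file and read it back one line at a time. The file name and record contents are made up for illustration, and the access shown is strictly sequential; it does not demonstrate indexed access of the kind Tie::File gives.

```perl
use strict;
use warnings;
use IO::Zlib;
use File::Temp qw(tempfile);

# Hypothetical compressed DB file.
my ($tmp_fh, $gz_file) = tempfile(SUFFIX => '.gz', UNLINK => 1);
close $tmp_fh;

# Write three plain-text records through a gzip-compressing handle.
my $out = IO::Zlib->new($gz_file, 'wb')
    or die "Cannot open $gz_file for writing: $!";
$out->print("$_ | record number $_ | big chunk of data\n") for 1 .. 3;
$out->close;

# Read them back sequentially; only the current line is held in memory.
my $in = IO::Zlib->new($gz_file, 'rb')
    or die "Cannot open $gz_file for reading: $!";
my @ids;
while (defined(my $line = $in->getline)) {
    chomp $line;
    my ($id) = split / \| /, $line;
    push @ids, $id;
}
$in->close;

print "read ids: @ids\n";
```

Because the whole file is compressed here, the individual chunks are left as plain text, in line with the advice not to compress twice.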
Re: TIE:: to a compressed file ?
by Aristotle (Chancellor) on Jul 23, 2002 at 00:44 UTC

    Tie::File streams the file off disk without slurping, but so would using a simple while(<>). On the other hand, the zipped chunk of data may contain newlines, and as far as I can tell you are not accounting for that. You can go to the trouble of zipping the "database files", but the savings will likely be marginal, and hardly justify the trouble you have to go through.
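For comparison, a plain read loop of the kind mentioned above might look like the following sketch. It scans an uncompressed file one line at a time and stops at the first matching id. The `find_record` helper, the " | " separator, and the sample data are assumptions for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample DB file with five one-line records.
my ($fh, $db_file) = tempfile(UNLINK => 1);
print {$fh} "$_ | title $_ | payload $_\n" for 1 .. 5;
close $fh or die "Cannot close $db_file: $!";

# Scan the file line by line and return the fields of the first record
# whose id matches. Only one line is in memory at any time.
sub find_record {
    my ($file, $wanted_id) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        chomp $line;
        my ($id, $text, $chunk) = split / \| /, $line, 3;
        next unless $id == $wanted_id;
        close $in;
        return ($text, $chunk);
    }
    close $in;
    return;    # no record with that id
}

my ($text, $chunk) = find_record($db_file, 4);
print "found: $text / $chunk\n";
```

The loop costs a scan from the start of the file on every lookup, which is the main thing Tie::File's line-offset bookkeeping saves when the same file is indexed repeatedly.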

    I don't think it's a good strategy, really. Is there any specific reason not to use an actual database? Or, barring that, for whatever reason, maybe something like Archive::Zip?

    Makeshifts last the longest.

      I wanted to do something similar. The suggestion of IO::Zlib and Tie::File seemed like a nice idea, but IO::Zlib doesn't support seeking as required by Tie::File. Thanks, Arthur

Node Type: perlquestion [id://184139]
Approved by VSarkiss