PerlMonks  

Re: Caching Format

by Eliya (Vicar)
on Jan 11, 2012 at 18:06 UTC


in reply to Caching Format

If I'm understanding you correctly, you could do away with the extra file (and the need for locking) by putting the info in the filenames themselves. For example:

CollectionID_IDX_N_imgname.jpg

where CollectionID is a unique collection identifier, N is the expected total number of images, and IDX the image number within the collection (the collection ID could also be a directory).  All you then have to do after having received a new image is a simple glob plus a check for completeness.
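A minimal sketch of that glob-plus-completeness check, using core modules only. The helper name collection_complete is hypothetical, and the sketch assumes that IDX runs from 1 to N and that collection IDs contain neither underscores nor glob metacharacters; none of that is stated above.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);

# Hypothetical helper: true once every image of a collection is present
# in $dir. Assumes names of the form CollectionID_IDX_N_imgname.jpg with
# IDX running from 1 to N (the 1-based numbering is an assumption).
sub collection_complete {
    my ($dir, $collection_id) = @_;
    my (%seen, $expected);

    # bsd_glob, unlike plain glob, does not split the pattern on whitespace.
    for my $path (bsd_glob("$dir/${collection_id}_*.jpg")) {
        # Pull IDX and N out of the file name; skip names that do not fit.
        my ($idx, $n) = basename($path) =~ /^\Q$collection_id\E_(\d+)_(\d+)_/
            or next;
        $expected //= $n;
        return 0 if $n != $expected;    # files disagree about N
        $seen{ $idx + 0 } = 1;          # "+ 0" normalises zero-padded indices
    }

    return 0 unless $expected;          # nothing received yet
    for my $i (1 .. $expected) {
        return 0 unless $seen{$i};      # image $i is still missing
    }
    return 1;
}
```

You would call it after each new image arrives, e.g. collection_complete('/path/to/incoming', 'C42'), and start processing once it returns true.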

Update: forgot to mention that to avoid potential concurrency issues (reading files that have not yet been completely written), you'd rename a file to its final name only after you've finished writing it.
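Roughly like this, as a sketch with core modules only. The helper name write_then_rename, the incoming_ prefix and the .part suffix are made up for illustration. The temporary file is created in the target directory so that the rename stays within one filesystem, where POSIX systems swap the name in atomically.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: write $data under a temporary name, then rename
# it to $final_path only after the write has finished.
sub write_then_rename {
    my ($final_path, $data) = @_;

    # Same directory as the final file, so rename() never crosses
    # filesystems. The ".part" suffix keeps the file out of a "*.jpg" glob.
    my ($fh, $tmp_path) = tempfile(
        'incoming_XXXXXX',
        DIR    => dirname($final_path),
        SUFFIX => '.part',
    );
    binmode $fh;
    print {$fh} $data or die "write to $tmp_path failed: $!";
    close $fh        or die "close of $tmp_path failed: $!";

    rename $tmp_path, $final_path
        or die "rename $tmp_path -> $final_path failed: $!";
    return $final_path;
}
```

A reader that only looks at *.jpg names then never sees a half-written image. Note that tempfile creates the file readable by its owner only, so a chmod before the rename may be needed if another user does the reading.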

Replies are listed 'Best First'.
Re^2: Caching Format
by RichardK (Parson) on Jan 12, 2012 at 12:05 UTC

    It sounds like there are multiple collections in use at the same time, so it should be worth storing all the files for one collection in its own directory. That will make examining the files quicker. But if there are thousands of files, then subdirectories could help too.

     collectionNN/C000/files[0-99] collectionNN/C100/files[100-199] etc ...
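A small sketch of how an image index could be mapped onto that layout, assuming 100 files per bucket as in the example above. The helper name bucket_dir is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: map an image index to its bucket directory,
# 100 files per bucket: 0-99 go to C000, 100-199 to C100, and so on.
sub bucket_dir {
    my ($collection_dir, $idx) = @_;
    my $bucket = sprintf 'C%03d', int($idx / 100) * 100;
    return File::Spec->catdir($collection_dir, $bucket);
}
```

For instance, bucket_dir('collection07', 142) names the C100 bucket inside collection07. The directory still has to be created before the first file is written to it, for example with File::Path.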
      Hello RichardK, friends,

      Sorry for answering via a shotgun message. First, thanks again for your time and help. After I posted my first reply I began playing around with http://search.cpan.org/~cleishman/Cache-2.04/ and fell in love with it. It was simple and handled the metadata caching format nicely. Collection entries could be pushed/popped as hash key=>value pairs. It also handled file locking and provided methods for everything I needed to do. Unfortunately, I found out later from my boss that not only are databases not allowed, but any Perl module that is not a Perl 5 core module cannot be used either. Mulligan!

      Regarding the heap vs. files debate, I learned that the required level of persistence is actually quite high, certainly high enough to warrant the use of a database if that were an option. Essentially, collections will be kept indefinitely. That is the reason I chose to use files. I also found out for certain that I cannot modify file names. As of now I plan on creating a pseudo-namespace for each collection by throwing its metadata and files into a unique directory.

      Cheers,
      Hok

      P.S. I used a lot of buzzwords and somehow left out "Cloud" so there I said it.
