Re: Caching Format

If I'm understanding you correctly, you could do away with the extra file (and the need for locking) by putting the info in the filenames themselves. For example

    CollectionID_IDX_N_imgname.jpg

where CollectionID is a unique collection identifier, N is the expected total number of images, and IDX is the image number within the collection (the collection ID could also be a directory). After receiving a new image, all you have to do is a simple glob plus a check for completeness.
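A minimal sketch of that glob-plus-check, using only core modules. The sub name `collection_complete` and the `$dir` layout are made up for illustration, and it assumes the collection ID contains no underscore or glob metacharacters and that every file of a collection carries the same N:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);      # core; unlike glob(), does not split on spaces
use File::Basename qw(basename);  # core

# Hypothetical helper: true once all N images of collection $id are in $dir.
# Expects names of the form CollectionID_IDX_N_imgname.jpg.
sub collection_complete {
    my ($dir, $id) = @_;
    my (%seen, $expected);
    for my $path (bsd_glob("$dir/${id}_*_*_*.jpg")) {
        my $name = basename($path);
        next unless $name =~ /\A\Q$id\E_(\d+)_(\d+)_/;
        $seen{ $1 + 0 } = 1;                        # IDX, counted once each
        $expected = $2 + 0 unless defined $expected; # N, from the first match
    }
    return (defined($expected) && keys(%seen) == $expected) ? 1 : 0;
}
```

You would call it right after storing an image, e.g. `process($id) if collection_complete('cache', $id);`, where `process` stands for whatever you do with a finished collection.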

Update: I forgot to mention that, to avoid potential concurrency issues (a reader picking up a file that is not yet completely written), you'd rename a file to its final name only after you've finished writing it.
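The write-then-rename step could look roughly like this. Again only a sketch: `store_image` and the `.part` suffix are invented names, and it relies on `rename` being atomic within a single filesystem, which holds on POSIX systems:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # core

# Hypothetical helper: write $data under a temporary name inside $dir,
# then rename it to $final_name once the write has finished.
sub store_image {
    my ($dir, $final_name, $data) = @_;
    # The temp file is created in the target directory so that rename()
    # never crosses filesystems; the .part suffix keeps it out of a
    # *.jpg glob while it is still being written.
    my ($fh, $tmp) = tempfile('incoming_XXXXXX', DIR => $dir, SUFFIX => '.part');
    binmode $fh;
    print {$fh} $data or die "write to $tmp failed: $!";
    close $fh         or die "close of $tmp failed: $!";
    my $final = "$dir/$final_name";
    rename $tmp, $final or die "rename to $final failed: $!";
    return $final;
}
```

A reader that globs for `*.jpg` then either sees the complete file or does not see it at all.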


Re^2: Caching Format by RichardK (Parson) on Jan 12, 2012 at 12:05 UTC
It sounds like there are multiple collections in use at the same time, so it should be worth storing all the files for one collection in its own directory. That will make examining the files quicker. But if there are thousands of files, then subdirectories could help too. `collectionNN/C000/files[0-99] collectionNN/C100/files[100-199] etc ...`
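One way to compute the subdirectory for a given file, as a sketch: `bucket_dir` is a made-up name, the bucket size of 100 simply mirrors the layout above, and it assumes each file has a numeric index within its collection:

```perl
use strict;
use warnings;
use File::Spec;   # core

# Hypothetical helper: map an image index to its bucket directory,
# e.g. index 137 with 100 files per bucket lands in C100.
sub bucket_dir {
    my ($collection_dir, $idx, $per_dir) = @_;
    $per_dir ||= 100;                              # files per subdirectory
    my $start = int($idx / $per_dir) * $per_dir;   # first index of the bucket
    return File::Spec->catdir($collection_dir, sprintf('C%03d', $start));
}
```

The directory can then be created on demand with `make_path` from the core File::Path module before the file is written.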
Re^3: Caching Format by hok_si_la (Curate) on Jan 13, 2012 at 23:15 UTC
Hello RichardK, friends,

Sorry for answering via a shotgun message. First, thanks again for your time and help.

After I posted my first reply I began playing around with http://search.cpan.org/~cleishman/Cache-2.04/ and fell in love with it. It was simple and handled the metadata caching format nicely. Collection entries could be pushed/popped as hash key=>value pairs. It also handled file locking and provided methods for all of the things I needed to do. Unfortunately, I found out later from my boss that not only are databases not allowed, but any Perl module that is not a Perl 5 core module cannot be used either. Mulligan!

Regarding the heap vs. files debate: I learned that the required level of persistence is actually quite high, certainly high enough to warrant the use of a database if that were an option. Essentially, collections will be kept indefinitely. That is the reason I chose to use files. I also found out for certain that I cannot modify file names. As of now I plan on creating a pseudo-namespace for each collection by putting each collection's metadata and files into its own directory.

Cheers,
Hok

P.S. I used a lot of buzzwords and somehow left out "Cloud", so there, I said it.


	PerlMonks