http://qs321.pair.com?node_id=434479


in reply to Re: How to maximise the content of my data CD
in thread How to maximise the content of my data CD

Won't always work, though I can't prove how far off it'll be. Mainly, it should work pretty well when you have lots of extra space, but an approximate solution won't be "close enough" if you end up too close to exactly filling all the discs. Consider files of size 350, 349, 233, 232, and 231. You can fit them on 2 700 Mb discs (350+349, 233+232+231), but your algorithm will use 3 dics. (If you tried to use only 2, you'd end up with 350+233, 349+232, and the 231 wouldn't fit on either).

What I can't prove without a lot more thought is whether you're ever going to be off by more than a single disc, and what can't be proven at all is whether that's close enough for real world purposes. (Since what's "acceptable" in the real world has to do with how long you're willing to wait for an answer vs. how much you care about that extra disc, and other factors.) But just know that the greedy approach won't only be suboptimal in theory, but it will also, sometimes, bleed over into an actual difference.

Replies are listed 'Best First'.
Re^3: How to maximise the content of my data CD
by Limbic~Region (Chancellor) on Feb 25, 2005 at 14:40 UTC
    Eimi Metamorphoumai,
    Won't always work,...

    What I didn't say explicitly, but was implied by my bullet points was that 1 disc is being added to account for perfect (or even near perfect fits). The knapsack problem is hard but we aren't trying to break encryption we are trying to save a few pennies on CDs. I don't think (though I could be wrong) that it will ever waste more than 1 disk. Too make matters more difficult, we aren't talking about a handful of files but more likely hundreds if not thousands. Let's say that the total size is an exact multiple of 1 CD. That means every single CD needs to be an exact match (which may not even be possible). Proving it can or can not might take a while (extreme sarcasm). Why not just go with a "good enough" solution?

    Update 2008-11-26: It turns out that this heuristic approach can be is much as 11/9 OPT + 1 bin (according to bin packing). While my experience has been that 1 extra is all you will ever need, it is possible to need more.

    Cheers - L~R