http://qs321.pair.com?node_id=938988


in reply to Re^2: "Just use a hash": An overworked mantra?
in thread "Just use a hash": An overworked mantra?

For this problem, where the known solution space is constrained to what fits into a reasonably sized hash, and where the total number of records and data streams also fits into memory, a memory-based solution works just fine, and there is utterly no reason to trundle out n-digit numbers to “prove” your point.

My original comment, which I said even at the time was ancillary to the original discussion, is that there do exist other classes of problems which, for various reasons, do not lend themselves well to the “random-access” (and “memory-based”) approaches that might occur to you at first blush. This might not be one of those cases, but that does not invalidate the fact that such problems exist. In those problems, the incremental costs of virtual-memory activity become a death by a thousand cuts. A fundamental change of approach in those cases transforms a process that runs for days into one that runs in just a few hours. I have seen it. I have done it. “Batch windows” are a reality for certain common business computing jobs. Last year I worked on a system that processes more than a terabyte of new data, assimilated from hundreds of switching stations, every single day, and this was the change that gave them their system back.
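To make “a fundamental change of approach” concrete, here is a minimal sketch of one such change: counting keys by sorting bounded chunks to disk and merging them sequentially, rather than holding every key in one big in-memory hash. This is only an illustration under my own assumptions, not the code from the system mentioned above. The function names and the `chunk_size` parameter are hypothetical, and keys are assumed to be strings that contain no newlines.

```python
import heapq
import itertools
import tempfile


def _spill(sorted_keys):
    """Write one sorted run to a temporary file and rewind it for reading."""
    run = tempfile.TemporaryFile(mode="w+")
    run.writelines(key + "\n" for key in sorted_keys)
    run.seek(0)
    return run


def count_keys_sequentially(keys, chunk_size=100_000):
    """Yield (key, count) pairs in key order.

    At most chunk_size keys are held in memory at once (hypothetical
    default); everything else lives in sorted runs on disk and is read
    back strictly sequentially.
    """
    runs = []
    keys = iter(keys)
    while True:
        # Sort one bounded chunk in memory, then spill it to disk.
        chunk = sorted(itertools.islice(keys, chunk_size))
        if not chunk:
            break
        runs.append(_spill(chunk))
    try:
        # Merge the sorted runs; equal keys come out adjacent to each other.
        merged = heapq.merge(
            *((line.rstrip("\n") for line in run) for run in runs)
        )
        for key, group in itertools.groupby(merged):
            yield key, sum(1 for _ in group)
    finally:
        for run in runs:
            run.close()
```

For example, `list(count_keys_sequentially(["b", "a", "b", "c", "a", "b"], chunk_size=2))` gives `[("a", 2), ("b", 3), ("c", 1)]`. For data that fits comfortably in memory a plain hash is simpler and faster; the sketch only pays off when the working set would otherwise be paged in and out at random.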

I was really, really hoping that in this case you wouldn’t rush out once again to prove how smart you are.   Let alone, as so many times before, publicly and at my expense.   Enough.