http://qs321.pair.com?node_id=605360


in reply to Re^3: 5x6-bit values into/out of a 32-bit word
in thread 5x6-bit values into/out of a 32-bit word

This is a very interesting point, and one well worth considering. In Perl, of course, the difference would not be worth mentioning, but in C, my unrolled version is much faster than a version with a loop. I had guessed it would be twice as fast, but when I measured it, it was five times as fast.
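For readers who have not seen the two variants side by side, here is a minimal sketch of what "unrolled" and "loop" could look like in C. It is not the code that was timed above. The function names and the bit layout (value i in bits 6*i through 6*i+5, top two bits unused) are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed layout (not taken from the thread): value i occupies
   bits 6*i .. 6*i+5, so the top two bits of the word stay zero. */

/* Unrolled: five mask/shift/OR steps written out, no loop counter. */
static uint32_t pack5_unrolled(const uint8_t v[5])
{
    return  ((uint32_t)(v[0] & 0x3F))
          | ((uint32_t)(v[1] & 0x3F) <<  6)
          | ((uint32_t)(v[2] & 0x3F) << 12)
          | ((uint32_t)(v[3] & 0x3F) << 18)
          | ((uint32_t)(v[4] & 0x3F) << 24);
}

/* Loop: builds the same word, starting from the highest field. */
static uint32_t pack5_loop(const uint8_t v[5])
{
    uint32_t word = 0;
    for (int i = 4; i >= 0; i--)
        word = (word << 6) | (uint32_t)(v[i] & 0x3F);
    return word;
}

/* Unrolled inverse: pull each 6-bit field back out. */
static void unpack5(uint32_t word, uint8_t v[5])
{
    v[0] = (uint8_t)( word        & 0x3F);
    v[1] = (uint8_t)((word >>  6) & 0x3F);
    v[2] = (uint8_t)((word >> 12) & 0x3F);
    v[3] = (uint8_t)((word >> 18) & 0x3F);
    v[4] = (uint8_t)((word >> 24) & 0x3F);
}
```

Both pack functions return the same word for the same input, so any timing difference comes purely from how the work is expressed. How large that difference is will depend on the compiler and its optimization flags.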

Granted, that is microseconds versus microseconds, but why are we packing 5 numbers into one 32-bit word in the first place? Presumably because we care about space, which only matters if we have a lot of them, probably millions, so all those microseconds can add up.

The maintenance concerns are, in this specific case, probably not valid. You can't pack 6x6-bit values (36 bits) into a 32-bit word, nor 5x7-bit values (35 bits), so if either the count or the width changed, we would have to change the algorithm anyway. And if you actually write out the loop version, you've probably only saved one line of code.
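The bit budget behind that claim is easy to check: 5 x 6 = 30 bits fits in 32, while 6 x 6 = 36 and 5 x 7 = 35 do not. A tiny sketch, with a hypothetical helper name chosen only for this example:

```c
#include <stdbool.h>

/* Hypothetical helper: do `count` fields of `bits` bits each
   fit into a single 32-bit word? */
static bool fits_in_32(unsigned count, unsigned bits)
{
    return count * bits <= 32u;
}
```

With this, fits_in_32(5, 6) is true, while fits_in_32(6, 6) and fits_in_32(5, 7) are both false.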

I'm not disagreeing with your principles, but I think that in this case I would probably go with my version.

There's a very good essay, The Fallacy of Premature Optimization. One snippet:

Note, however, that Hoare did not say, "Forget about small efficiencies all of the time." Instead, he said "about 97% of the time." This means that about 3% of the time we really should worry about small efficiencies. That may not sound like much, but consider that this is 1 line of source code out of every 33. How many programmers worry about the small efficiencies even this often? Premature optimization is always bad, but the truth is that some concern about small efficiencies during program development is not premature.