|No such thing as a small change|
Re: What do you know, and how do you know that you know it?by hv (Parson)
|on Aug 02, 2004 at 12:44 UTC||Need Help??|
Interesting meditation. :)
As for what I know that I know: I'd limit that to mathematics of a particular formal sort, that I've proved myself, along the lines of "given these axioms, and these rules of logic, this conclusion follows". I'd like to see a project to prove all known maths in this way, using declared rules of logic and steps small enough to permit computer verification - it boils down to nothing but symbolic manipulation. (I even registered "axiomattic.org" for a while, hoping to start the project myself.)
An important aspect of this approach is the definition of 'axiom'. Sometimes axioms are defined as "things that are so obviously true that they need no proof", but that is not how I use the term - if I did I'd have to deny all axioms. Instead, axioms are simply "the things assumed true for the purposes of this proof", and the result of a proof is "assuming these axioms and these rules, this necessarily follows".
Within the realm of programming, I had a very fortunate and salutary experience around 1991, when working on a 6800-based machine with 48K of RAM at a time when I was familiar with every line of code in the O/S. I encountered an unexpected error which halted the machine, and because of the limited RAM saving and printing out the entire core was a practical thing to do.
Tracking back from the location of the error showed that some memory had been corrupted ... by an error elsewhere, in which some memory had been corrupted ... by an error elsewhere, in which some memory had been corrupted ... in a very strange way. It turned out that one instruction had been changed from 0xA8 0x01 (LDAA+ 1 - load register A with the byte pointed to by the pointer register + 1) to 0xA9 xxx, an invalid opcode which nonetheless by symmetry should represent an instruction like 'STS#' - store stack immediate.
The processor had actually skipped a byte, then stored the stack pointer over the following two bytes, and the rest of the mayhem was caused by that. What changed the 0xA8 to 0xA9? Probably an alpha particle: examining that byte of memory showed that it had a parity error. (With the help of a colleague, I was able to account for every byte in memory other than that one.)
That experience in my formative years was very instructive: no matter how defensively you code, the fallibility of hardware means there are no guarantees, ever. This is why important applications use redundancy (see "#8: Why five computers?" under that link).
A couple of years later I learned another important lesson. By now I was writing C code for DOS and Windows, using Microsoft's compiler which was buggy as hell. Every few days I'd track down a problem, contact MS technical support, work my way through the layers of gap-toothed flunkeys determined to point me at a page of a manual I knew better than they, eventually get through to a hallowed programmer who'd (on a good day) look at my test case, confirm that it was a bug, and promise me that it'd be fixed in the next release - which was 6 months away, would cost me more money, and would introduce myriad new bugs.
After a while I noticed that my progress had slowed down purely because of my lack of trust in the tools - whenever my program failed to behave as expected, my first port of call was the debugger's facility to show the C code side by side with the generated assembler, to try and discover how the compiler had screwed up this time. But bad as the compiler was, the majority of the time the bug was in my code, and I wasted an awful lot of time ruling out the compiler each time before examining my own code.
So from this I learnt the importance of good tools (and the value, only realised when I discovered the open source community a few years later, of being able to fix the tools yourself). I also learnt that chances are, it's me that screwed up this time. And I learnt, more generally, that I (and by observation others) will always look first to blame anything other than our own code.
Update: s/cosmic ray/alpha particle/ after reading http://www.science.uva.nl/~mes/jargon/c/cosmicrays.html