in reply to (OT) Rewriting, from scratch, a huge code base
That said, I disagree with the thesis. I do not believe that old code is good code. I do not believe that all code is worth rescuing. There are times a rewrite is necessary. Here are a few reasons that I have done rewrites in the past and would do them again:
- The code is scattered across several languages and rewriting in just one would allow more consistent argument processing and error handling. For instance, I once replaced a lot of old Expect scripts with Perl for this reason.
- The code relies on interfaces that are fundamentally broken. For instance, code built on Text::CSV is unable to handle embedded newlines. Given a system that processes CSV files and needs to handle embedded newlines, the code has to be fixed and the broken module removed.
- The code is of sufficiently low quality that fixing it is harder than replacing it. IBM found in the 1980s that when they tracked bugs, something like 10% of the components accounted for most of the bugs. Rewriting those components (once identified) from scratch significantly reduced overall bug counts.
- The code is full of hard-coded information (e.g. paths) that you need to track down and replace with something more flexible. A particularly good opportunity for this is when you need to move it from one machine to another. Choices, choices: recreate the environment that it needs, undocumented dependencies and all, or replace its functionality with a version that is more portable?
- The system depends on a component that you are trying to eliminate. Fairly often a system will have two parts that do pretty much the same thing. Life would be easier if you were only using one of them. (Less to remember, easier to teach people how things work, etc.) In the process of doing that, parts that use the losing component will be replaced as opportunity permits.
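To make the embedded-newline point in the list above concrete, here is a minimal sketch. It uses Python's standard csv module purely as a stand-in; it is not Text::CSV and says nothing about how that module works inside. The sample data and function names are invented for illustration. The idea is only that any interface which sees the input one physical line at a time cannot be patched into handling a quoted field that spans lines.

```python
import csv
import io

# Hypothetical sample data: the comment field of record 1 contains a newline.
DATA = 'id,comment\r\n1,"first line\nsecond line"\r\n2,plain\r\n'


def parse_line_by_line(text):
    """Naive interface: every physical line is assumed to be one record."""
    return [line.split(",") for line in text.splitlines()]


def parse_as_stream(text):
    """Hand the parser the whole stream so a quoted newline stays in its field."""
    return list(csv.reader(io.StringIO(text, newline="")))


naive = parse_line_by_line(DATA)
proper = parse_as_stream(DATA)

# The naive version splits record 1 into two bogus rows.
print(len(naive), "rows from the line-based interface")
print(len(proper), "rows from the stream-based parser")
```

No amount of cleanup inside parse_line_by_line fixes this, because the record boundary is already lost before it is called. That is the sense in which the interface, not just the code, is broken.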
In other words, the primary reason for rewriting the renderer in Netscape 4 was not performance; it was that the old rendering engine tied them to an inherently buggy API that was biting them over and over again. Performance was a visible second issue. Of course rewriting the rest of Netscape went beyond that...
However, he is right that the worst way to do a rewrite is to sit down and start writing something completely new from scratch. Instead, I like to work as I suggested in Re (tilly) 1: Best way to fix a broken but functional program?. Decide on the overall flow of a new design. Pull some of the existing mess out and make it a shell around the new design. For instance, if you need a new rendering engine, release something with the old one and spec out the new engine. Then start writing the new one incrementally, scooping functionality out of the old as you go. It may take longer, but you don't lose the existing knowledge. You don't stop yourself from delivering the product. And if you do it right, at some point the old becomes a small shell that you can kill when the right moment comes.
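The approach above can be sketched roughly as follows. This is only an illustration under my own assumptions: the class names (OldRenderer, NewRenderer, Shell), the "kind" argument, and the output strings are all hypothetical, and a real rendering engine would be far messier. The shell is what remains of the old code's entry point; it hands work to the new engine wherever the new engine is ready and falls back to the old code everywhere else.

```python
class OldRenderer:
    """Stand-in for the existing code: it handles everything, however badly."""

    def render(self, kind, text):
        return f"old:{kind}:{text}"


class NewRenderer:
    """The new design, grown one feature at a time."""

    def __init__(self):
        self._handlers = {}

    def register(self, kind, handler):
        # Each registration scoops one piece of functionality out of the old code.
        self._handlers[kind] = handler

    def supports(self, kind):
        return kind in self._handlers

    def render(self, kind, text):
        return self._handlers[kind](text)


class Shell:
    """What is left of the old entry point: a thin router around the new design."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def render(self, kind, text):
        if self.new.supports(kind):
            return self.new.render(kind, text)
        return self.old.render(kind, text)  # not migrated yet


new_engine = NewRenderer()
shell = Shell(OldRenderer(), new_engine)

before = shell.render("heading", "Hi")   # still served by the old code
new_engine.register("heading", lambda text: f"new:heading:{text.upper()}")
after = shell.render("heading", "Hi")    # now served by the new engine
other = shell.render("table", "x")       # untouched features keep working

print(before, after, other)
```

The product ships at every step, because callers only ever talk to the shell. Once every kind is registered with the new engine, the fallback branch is dead code and the old renderer can be deleted.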
Now some may tell me that I just invented refactoring. I disagree. The fundamental principle of refactoring is to incrementally rewrite code through a series of transformations. What I am describing is a technique for writing a new project from scratch, while analyzing the old code very carefully for important things that it had, and with the plan of throwing the old code away when you can. Conceptually, refactoring is the process of transforming cr*p into soil. What I am describing is a process of incremental replacement, laying the ground for an apparently catastrophic replacement once the new foundation is there.
Now this doesn't mean that I think refactoring is a bad idea. It is a great one. But trying to preserve something just because it is already written is a mistake in my book.