http://qs321.pair.com?node_id=885428


in reply to Re: Nobody Expects the Agile Imposition (Part VI): Architecture
in thread Nobody Expects the Agile Imposition (Part VI): Architecture

Thank you for a well-thought-out response. While the newer word "refactoring" seems to be pretty well defined, I feel that the older word "rewriting" is not. From Martin Fowler's original Refactoring book:

"Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. ... In essence when you refactor you are improving the design of the code after it has been written."
From refactoring.com:
"Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior preserving transformations. Each transformation (called a 'refactoring') does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it's less likely to go wrong. The system is also kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring."
Hopefully, most folks will agree with those definitions. Now it gets much harder. For example, your opinion:
"Subversion, git, and Mercurial are not rewrites of CVS."
does not agree with mine. My personal view is that Subversion was a "rewrite" of CVS, while the other two were not, though I don't feel strongly about it. I may well be "unorthodox", as you claim, yet I was pleasantly surprised to discover that many others, including Joel Spolsky, share my opinion. From Joel Spolsky:
"You may also want to look into Subversion, a ground-up rewrite of CVS with many advantages."
From Open Source Software Development (wikipedia):
"A good example of a complete rewrite was the Subversion version control system, whose developers started from scratch: they believed the codebase of CVS (an older attempt at creating a version control system), was useless and needed to be completely scrapped."
From Concurrent Versions System (c2.com):
"SubVersion is a project to rewrite CVS from scratch, in a more flexible and extendible way - and then to extend it."
Finally, a probing (and relevant to this thread) question from Shlomi Fish interviews Ben Collins-Sussman:
"Subversion was a re-write from the grounds up done by many of the original CVS workers. Do you think it could have been faster to replace CVS (or CVSNT) component by component, thus yielding Subversion?"

To take another example, while I view Perl 6 as a "rewrite" of Perl 5, I suspect many monks would disagree with that view; a couple of them have already made that plain in this thread. Note however that Larry Wall at least seems to view Perl 6 as a "rewrite" of Perl:

"Perl 5 was my rewrite of Perl. I want Perl 6 to be the community's rewrite of Perl and of the community."
Admittedly, that quote was taken from State of the Onion, TPC4, and the direction of Perl 6 has changed a bit since then. I'd be interested to know if Larry still views Perl 6 as a "rewrite" of Perl 5.

Open Source Software Development (wikipedia) neatly summarizes the available rewrite/refactor options:

Often open source developers feel that their code requires a revamp. This can be either because the code was written or maintained without proper refactoring (as is often the case if the code was inherited from a previous developer), or because a proposed enhancement or extension of it cannot be cleanly implemented with the existing codebase. A final reason for wishing to revamp the code is that the code "smells bad" (to quote Martin Fowler's Refactoring book) and does not meet the developer's standards. There are several kinds of revamps:
  1. Refactoring implies that the code is moved from one place to another, methods, functions or classes are extracted, duplicate code is eliminated and so forth - all while maintaining an integrity of the code. Such refactoring can be done in small amounts (so-called "continuous refactoring") to justify a certain change, or one can decide on large amounts of refactoring to an existing code that last for several days or weeks.
  2. "Partial rewrites" involve rewriting a certain part of the code from scratch, while keeping the rest of the code. Such partial rewrites have been common in the Linux kernel development, where several subsystems were rewritten or re-implemented from scratch, while keeping the rest of the code intact.
  3. Complete rewrites involve starting the project from scratch, while possibly still making use of some old code. A good example of a complete rewrite was the Subversion version control system, whose developers started from scratch: they believed the codebase of CVS (an older attempt at creating a version control system), was useless and needed to be completely scrapped. Another good example of such a rewrite was the Apache web server, which was almost completely re-written between version 1.3.x and version 2.0.x.
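
To make the first kind of revamp concrete, here is a minimal sketch of a small, behavior-preserving refactoring: duplicated logic is extracted into a helper, and a check confirms that the external behavior is unchanged. The function names, the prices (in cents), the 10% discount and the handling fee are all hypothetical, invented purely for illustration; none of them come from the sources quoted above.

```python
# Hypothetical example: every name and number here is illustrative only.

def order_total_before(prices_cents, is_member):
    """Original version: the discount rule is duplicated in both branches."""
    total = 0
    for price in prices_cents:
        if is_member:
            if price > 10_000:
                total += price * 90 // 100
            else:
                total += price
        else:
            # Non-members pay a flat 500-cent handling fee per item.
            if price > 10_000:
                total += price * 90 // 100 + 500
            else:
                total += price + 500
    return total


def discounted(price):
    """Extracted helper: the discount rule now lives in exactly one place."""
    return price * 90 // 100 if price > 10_000 else price


def order_total_after(prices_cents, is_member):
    """Refactored version: same inputs, same outputs, simpler structure."""
    fee = 0 if is_member else 500
    return sum(discounted(price) + fee for price in prices_cents)


def same_behaviour(samples):
    """Return True if both versions agree on every (prices, is_member) sample."""
    return all(
        order_total_before(prices, member) == order_total_after(prices, member)
        for prices, member in samples
    )


if __name__ == "__main__":
    samples = [([], True), ([5_000, 20_000], True), ([5_000, 20_000], False)]
    print(same_behaviour(samples))
```

The point of the sketch is the workflow rather than the arithmetic: the old and new versions are compared on the same inputs after each small step, which is what distinguishes a refactoring from a partial or complete rewrite.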

Semantics aside, the interesting strategic decision we face is whether to extend an existing legacy code base or to throw it away and start from scratch. There is no one "right" answer to that question: it depends on the project, the team, the quality of the existing code base, and many other factors. Perhaps the most important thing is to keep the code from degenerating into a tangled mess in the first place.

Update (2023):