or, how to rewrite your code all over again. And again. Like a lot of words, 'refactoring' gets tossed around quite a bit these days. Mostly without any particular regard to whether or not it is understood. At least I know that I'm probably guilty of this particular assumption. In the environment that I currently inhabit, I'm typically the most senior programmer and I know from the glazed-over expressions I get from time to time that I'm not always as 'clear' as I would like.
I thought what I would do is take a fairly short piece of code and walk it through the normal process that I use when coding---except this time, take a series of 'snapshots' of it along the way. Creating a kind of record over time of how the code evolved. The code in question has already been posted, (see Perl, Chess, Lies, Dam Lies and Statistics) so think of this as a map of how I got there.
Typically I like to start (no claim to great wisdom here...) with something that works! In fact, I'll take just about any piece of code that fits this loose description. In this case, what I did was
a nearly complete copy of a working SVG page. The first pass code looks like this:
my $y = 0;
my @list;
print "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n
+";
print "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\"\n";
print " \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.
+dtd\" >\n";
print "<svg height=\"200\" width=\"200\">\n";
print " <line x1=\"0\" y1=\"0\" x2=\"80\" y2=\"0\"/>\n";
print " <polyline\n";
print " points=\"";
foreach (@scores) {
if ($_) {
push(@list,$y++);
push(@list,$_ * -1.0);
}
}
print join(",",@list);
print "\"\n";
print " style=\"stroke: black; width: 1px; fill: none;\"/>\n";
print "</svg>\n";
Mostly print statements. Nothing too fancy, just enough to serve as a kind of proof of concept. It also provides a 'minimal' metric---each successive iteration should work at least this well, otherwise the changes are heading in the wrong direction!
It shouldn't take much staring at this code to provoke at least one or two sub routines! And if that doesn't provide the inspiration, reading the fine manual on SVG should. Whatever the cause, the next thing that I wrote was a sub called hline---itself a combination of the desire to remove the print statement with the line command and the knowledge that I would need more horizontal lines! So now the code looks like:
my $y = 0;
my @list;
print "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n
+";
print "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\"\n";
print " \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.
+dtd\" >\n";
print "<svg height=\"200\" width=\"200\">\n";
print " ",hline(0,80,0);
print " <line x1=\"0\" y1=\"0\" x2=\"80\" y2=\"0\"/>\n";
print " <polyline\n";
print " points=\"";
foreach (@scores) {
if ($_) {
push(@list,$y++);
push(@list,$_ * -1.0);
}
}
print join(",",@list);
print "\"\n";
print " style=\"stroke: black; width: 1px; fill: none;\"/>\n";
print "</svg>\n";
sub hline {
my $x1 = shift;
my $x2 = shift;
my $y1 = shift;
my $y2 = $y1;
print " <line x1=\"$x1\" y1=\"$y1\" x2=\"$x2\" y2=\"$y2\"/>\n";
}
And using hline as a template, we create vline and before you know we have pretty much filled in around the original simple line graph.
my $y = 0;
my @list;
print "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n
+";
print "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\"\n";
print " \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.
+dtd\" >\n";
print "<svg height=\"200\" width=\"200\">\n";
foreach (-15..15) {
print hline(0,80,$_);
}
foreach (0..80) {
print vline(-15,15,$_);
}
print " <polyline\n";
print " points=\"";
foreach (@scores) {
if ($_) {
push(@list,$y++);
push(@list,$_ * -1.0);
}
}
print join(",",@list);
print "\"\n";
print " style=\"stroke: black; width: 1px; fill: none;\"/>\n";
print "</svg>\n";
sub hline {
my $x1 = shift;
my $x2 = shift;
my $y1 = shift;
my $y2 = $y1;
return " <line x1=\"$x1\" y1=\"$y1\" x2=\"$x2\" y2=\"$y2\"/>\n";
}
sub vline {
my $y1 = shift;
my $y2 = shift;
my $x1 = shift;
my $x2 = $x1;
return " <line x1=\"$x1\" y1=\"$y1\" x2=\"$x2\" y2=\"$y2\"/>\n";
}
After a little review of SVG, hline and vline both undergo some changes, first to add functionality (I wanted to change color), second to simplify the code (single 'my' statement) and third change from 'line' to path (both an improvement and a simplification, all SVG line commands wind up as path commands anyway, why not go direct). Still nothing tricky, although hline was extended in a way that did not break my previous usage---new parameter added to the end of the list keeps the old approach working.
my $y = 0;
my @list;
print "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n
+";
print "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\"\n";
print " \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.
+dtd\" >\n";
print "<svg height=\"400\" width=\"400\" viewbox=\"0,0,400,400\">\n";
print "<g transform=\"translate(0,15) scale(1,-1)\">\n";
print "<g transform=\"scale(4)\">\n";
foreach (-15..15) {
print hline(0,80,$_);
}
foreach (0..80) {
print vline(-15,15,$_);
}
print hline(0,80,0,'red');
print " <polyline\n";
print " points=\"";
foreach (@scores) {
if ($_) {
push(@list,$y++);
push(@list,$_);
}
}
print join(",",@list);
print "\"\n";
print " style=\"stroke: black; width: 1px; fill: none;\"/>\n";
print "</g>\n";
print "</g>\n";
print "</svg>\n";
sub hline {
my ($x1,$x2,$y1,$color) = @_;
if ($color) {
return " <path d=\"M $x1,$y1 H $x2\" style=\"stroke: $color;\
+"/>\n";
}
else {
return " <path d=\"M $x1,$y1 H $x2\"/>\n";
}
}
sub vline {
my ($y1,$y2,$x1,$color) = @_;
if ($color) {
return " <path d=\"M $x1,$y1 V $y2\" style=\"stroke: $color;\
+"/>\n";
}
else {
return " <path d=\"M $x1,$y1 V $y2\"/>\n";
}
}
So what next? Well, there are still too many 'print' statements in the main line, and we have yet to graph the other two 'lines' of data. So with that as a target, we refactor into this:
my $y = 0;
my @list;
print "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n
+";
print "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\"\n";
print " \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.
+dtd\" >\n";
print "<svg height=\"450\" width=\"450\" viewbox=\"0,0,450,450\">\n";
print "<g transform=\"translate(10,85) scale(1,-1)\">\n";
print "<g transform=\"scale(5)\">\n";
foreach (-15..15) {
print hline(0,80,$_);
}
foreach (0..80) {
print vline(-15,15,$_);
}
print " <path d=\"M -1 0 H 81\" style=\"stroke: red; stroke-opacity:
+.25; stroke-width: .1\"/>\n";
polyline('black',0.1,@scores);
polyline('green',0.1,@bestblack);
polyline('red',0.1,@bestwhite);
print "</g>\n";
print "</g>\n";
print "</svg>\n";
sub hline {
my ($x1,$x2,$y1,$color) = @_;
if ($color) {
return " <path d=\"M $x1,$y1 H $x2\" style=\"stroke: $color;\
+"/>\n";
}
else {
return " <path d=\"M $x1,$y1 H $x2\"/>\n";
}
}
sub vline {
my ($y1,$y2,$x1,$color) = @_;
if ($color) {
return " <path d=\"M $x1,$y1 V $y2\" style=\"stroke: $color;\
+"/>\n";
}
else {
return " <path d=\"M $x1,$y1 V $y2\"/>\n";
}
}
sub polyline {
my $color = shift;
my $width = shift;
my @values = @_;
my @list;
my $y = 0;
print " <polyline\n";
print " points=\"";
foreach (@values) {
if ($_) {
push(@list,$y);
push(@list,$_);
}
$y++;
}
print join(",",@list);
print "\"\n";
print " style=\"stroke: $color; stroke-width: $width; fill: non
+e;\"/>\n";
}
That didn't do much for the 'print' population, but it did add some useful functionality and avoided repeating the original 'polyline' with the loop and embedded 'join' in the middle. One of the best reasons to refactor is code that repeats essentially the same thing. Generalize this kind of code as a sub as soon as you can.
So at this point I've accumulated 3 subs and I've still got a lot of print statements. All things considered, there will always be 'some' print statements, but I should at least be able to tidy up the main line a bit more. Regards this last, I can easily predict that 'openSVG' and 'closeSVG' are both pretty likely candidates for the sub list---but before I coded the obvious, I spent some time thinking about what I'd already done and what I might want or need later. Without too much deep thought, anyone familiar with tag-based languages should be able to see a pattern here. Each SVG command typically consists of a key-word ('path', 'line' etc.) and some attributes ( 'style', 'd' etc.) Maybe what I needed is a little bit of 'meta' code. A sort of half step back and one step to the side kind of code. More to the point, code that I can use to re-write the other SVG calls. Actually to tell the truth, I didn't make this connection until I was about half way through writing a variety of 'open' subs---the process itself was the same kind of repetition as I was trying to eliminate! Luckily it didn't take too much typing before I was struck by the less than subtle pattern involved. At any rate, enter a new sub entitled 'openTAG' paired with 'closeTAG'. Armed with this, we can pretty much rewrite into the final state:
openSVG( 450, 450 );
openG( transform => 'translate(10,85) scale(1,-1)' );
openG( transform => 'scale(5)' );
openG( style => "font-size: .75pt", transform => "scale(1,-1)");
my $n = 1;
foreach ( 0 .. 79 ) {
openTAG(
'text',
x => $_ + .25,
y => 16,
style => "font-weight: bold;",
);
print ">",$_ % 10;
closeTAG('text');
if ($_ and ($_ % 10 == 0)) {
openTAG(
'text',
x => $_ + .25,
y => 17,
style => "font-weight: bold;",
);
print ">",$n++;
closeTAG('text');
}
}
closeG();
foreach ( -15 .. 15 ) {
hline( 0, 80, $_ );
}
foreach ( 0 .. 80 ) {
vline( -15, 15, $_ );
}
path(
d => 'M -1 0 H 81',
style => 'stroke: red; stroke-opacity: .25; stroke-width: .1'
);
graphline( 'black', 0.1, @scores );
graphline( 'green', 0.1, @bestblack );
graphline( 'red', 0.1, @bestwhite );
closeG();
closeG();
closeSVG();
sub path {
openTAG( 'path', @_ );
closeTAG();
}
sub openTAG {
my $tag = shift;
my %attributes = @_;
print "<$tag";
foreach ( keys %attributes ) {
print " $_=\"$attributes{$_}\"";
}
}
sub closeTAG {
my $s = shift;
if ($s) {
print "</$s>\n";
}
else {
print "/>\n";
}
}
sub openG {
openTAG( 'g', @_ );
print ">\n";
}
sub closeG {
closeTAG('g');
}
sub openSVG {
my $height = shift;
my $width = shift;
print "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"
+?>\n";
print "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\"\n";
print " \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dt
+d\">\n";
print
"<svg height=\"$height\" width=\"$width\" viewbox=\"0,0,$width,$height
+\">\n";
}
sub closeSVG {
closeTAG('svg');
}
sub hline {
my ( $x1, $x2, $y1, $color ) = @_;
if ($color) {
path( d => "M $x1,$y1 H $x2", style => "stroke: $color;" );
}
else {
path( d => "M $x1,$y1 H $x2" );
}
}
sub vline {
my ( $y1, $y2, $x1, $color ) = @_;
if ($color) {
path( d => "M $x1,$y1 V $y2", style => "stroke: $color;" );
}
else {
path( d => "M $x1,$y1 V $y2" );
}
}
sub graphline {
my $color = shift;
my $width = shift;
my @values = @_;
my @list;
my $y = 0;
foreach (@values) {
if ($_) {
push ( @list, $y );
push ( @list, $_ );
}
$y++;
}
openTAG(
'polyline',
points => join ( ',', @list ),
style => "stroke: $color; stroke-width: $width; fill: none;",
);
closeTAG();
}
This is where the code is today (it might even be evolved a bit further than the previous nodes listing), which is not to say that this is the end of the process. It's not! It is just the current at rest state. Further refactoring fuel is provided by the shortcomings of the current version...whatever that version may be! At a guess, I'll probably remove the code surrounding the drawing of the axis and the graph paper into it's own sub at some point. Clearly 'openSVG' could stand a bit more parameterize and so on. This is not a process that comes neatly to a halt. In fact if there is a problem with refactoring, it is that it is something that could go on pretty much forever---certainly longer than any given deadline. Part of managing either yourself or other programmers is in knowing when to bring this to a halt.
So this is my definition of 'Refactoring', or at least one of them. Is this all I mean? No, it's a pretty big subject and it certainly deserves a lot more than what I've given it here! Still this is at least the basis for what I mean when I talk about 'Refactoring'---I'm still convinced the eye glazing thing has to do with all of the damn doughnuts lying around...
Bibliography
@book{fowler:2,
author = {Martin Fowler and Kent Beck and John Brant and William Opdyke and Don Roberts},
title = {Refactoring: Improving the Design of Existing Code},
publisher = {Addison-Wesley},
address = {Boston, MA},
edition = {First},
year = {2000},
isbn = {ISBN 0201485672},
}
@book{eisenberg:1,
author = {J. David Eisenberg},
title = {SVG Essentials},
publisher = {O'Reilly},
address = {Sebastopol, CA},
edition = {First},
year = {2002},
isbn = {ISBN 0596002238},
}
--hsm
"Never try to teach a pig to sing...it wastes your time and it annoys the pig."