http://qs321.pair.com?node_id=317520


NAME

Often Overlooked Object Oriented Programming Guidelines


SYNOPSIS

The following is not about how to write OO code in Perl. There are plenty of nodes covering that topic. Instead, this is a general list of tips that I like to keep in mind when I'm writing OO code. It's not exhaustive, but it does cover a number of areas that I see many people (including myself) get wrong or overlook.


PROBLEMS

Useless OO

  • Don't use what you don't need.
  • Don't use OO if you don't need it. No sense in creating an object if there is nothing to encapsulate.
    sub new {
        my ($class, %data) = @_;
        return bless \%data, $class;
    }

    This constructor is not unusual, but it's suggestive of a useless use of OO. A good example of this is Acme::Playmate (er, maybe not the best example). The module consists of a constructor. That's it. And here's the documented usage:

    use Acme::Playmate;

    my $playmate = new Acme::Playmate("2003", "04");

    print "Details for playmate " . $playmate->{ "Name" } . "\n";
    print "Birthdate" . $playmate->{ "BirthDate" } . "\n";
    print "Birthplace" . $playmate->{ "BirthPlace" } . "\n";

    Regardless of whether or not you feel this is a useful module, there's nothing OO about it. In fact, with the exception of methods this module inherits from UNIVERSAL::, it has no methods other than the constructor. All it does is return a data structure that just happens to be blessed (the jokes are obvious; we don't need to go there).

    Of course, this is merely an Acme:: module, so discussing how well a joke conforms to good programming practices is probably not warranted, but read through Damian Conway's 10 Rules for When to Use OO to get a good feel for when OO is appropriate.
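
As a minimal sketch of the alternative, assuming a hypothetical fetch_record() function and invented data (neither comes from any real module): when there is nothing to encapsulate, a plain function returning a plain hash reference does the same job without pretending to be an object.

```perl
use strict;
use warnings;

# Hypothetical lookup function; the record below is made up for illustration.
sub fetch_record {
    my ($year, $month) = @_;
    return {
        Name       => 'Jane Doe',
        BirthPlace => 'Springfield',
        Issue      => "$year-$month",
    };
}

# No bless, no constructor: the caller gets a data structure and knows it.
my $record = fetch_record('2003', '04');
print "Details for $record->{Name} ($record->{Issue})\n";
```

If behavior shows up later (validation, derived values, lazy loading), that is the point at which a class starts to earn its keep.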

Object Hierarchy

  • Don't subclass simply to alter data
  • Subclass when you need a more specific instance of a class, not just to change data. If you do that, you simply want an instance of the object, not a new class. Subclass to alter or add behavior. While I don't see this problem a lot, I see it enough that it merits discussion.

    package Some::User;

    sub new { bless {}, shift; }

    sub user { die "user() must be implemented in subclass" }
    sub pass { die "pass() must be implemented in subclass" }
    sub url  { die "url() must be implemented in subclass" }

    On the surface, this might appear to simply be an interface that will be used as a base class for a set of classes. However, sometimes people get confused and simply override those methods to return data:

    package Some::User::Foo;

    sub user { 'bob' }
    sub pass { 'seKret' }
    sub url  { 'http://somesite.com/' }

    There's really no reason for that. Make it an instance:

    my $foo = Some::User->new('Foo');

    Thus, if you need to change how things work internally, you're doing that on only one class rather than hunting through a bunch of useless subclasses.
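
One way the instance-based version might look is sketched below. It assumes a constructor that takes named arguments (user, pass, url), which is an illustrative choice rather than the interface implied by the one-argument call above; the point is only that the data lives in the instance, so there is a single class to maintain.

```perl
use strict;
use warnings;

{
    package Some::User;

    # One class; per-user data is passed in rather than hard-coded in a subclass.
    # The named-argument interface is an assumption made for this sketch.
    sub new {
        my ($class, %args) = @_;
        for my $required (qw(user pass url)) {
            die "Missing required argument '$required'"
                unless defined $args{$required};
        }
        return bless {%args}, $class;
    }

    sub user { $_[0]->{user} }
    sub pass { $_[0]->{pass} }
    sub url  { $_[0]->{url} }
}

my $foo = Some::User->new(
    user => 'bob',
    pass => 'seKret',
    url  => 'http://somesite.com/',
);
print $foo->user, "\n";    # prints "bob"
```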

  • Law of Demeter
  • The Law of Demeter simply states that you should only talk to your immediate friends -- using a chain of method calls to navigate an object hierarchy is begging for trouble. For example, if an office object has a manager object, an instance of that manager might have a name.
    print $office->manager->name;

    That seems all fine and dandy. Now, imagine that you have that in 20 places in your code, but in the manager class, someone changes name to full_name. Because the code using the office object was forced to walk through the object hierarchy to get at the data it actually needs, you've created fragile code. Now either the manager class must keep supporting a name method to be backwards compatible (and we get to start on our big ball of mud), or every reference to it must be changed -- and we've created far too many of those.

    The solution is to do this:

    print $office->manager_name; # manager_name calls $manager->name

    Now, instead of hunting down all of the places where this was accessed, we've limited this call to one spot and made maintenance much easier. This can, however, lead to code bloat. Make sure you understand the tradeoffs involved.
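
A minimal sketch of that delegation follows. The Office and Manager classes and their constructors are assumptions made for illustration; only manager_name and name come from the example above.

```perl
use strict;
use warnings;

{
    package Manager;
    sub new  { my ($class, %args) = @_; return bless { name => $args{name} }, $class }
    sub name { $_[0]->{name} }
}

{
    package Office;
    sub new     { my ($class, %args) = @_; return bless { manager => $args{manager} }, $class }
    sub manager { $_[0]->{manager} }

    # The only place that knows how to get a name out of a manager.
    # If Manager renames name() to full_name(), only this line changes.
    sub manager_name { $_[0]->manager->name }
}

my $office = Office->new( manager => Manager->new( name => 'Alice' ) );
print $office->manager_name, "\n";    # prints "Alice"
```

The cost is one small forwarding method per piece of data you expose this way, which is where the code bloat comes from.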

  • Liskov substitution principle
  • While there is disagreement over what this means, this principle states (paraphrasing) that a subclass must present the same interface as its superclass. Some argue that the behavior of subclasses (or subtypes) should not change, though I feel that with proper encapsulation, this distinction goes away. For example, imagine a cash register program where a person's order is paid via a combination of credit card, check, and cash (such as when three people annoy the waiter by splitting the bill).
    foreach my $tender (@tenders) { $tender->apply($order); }

    In this case, let's assume there is a Tender::Cash superclass and subclasses along the lines of Tender::CreditCard and Tender::LetsHopeThisDoesntBounce. The credit card and check classes can be used exactly as if they were cash. Their apply() methods are probably different internally, but every method that's available for cash should be available for the subclasses, and the data returned should be identical in form. (This might be a bad example, as a generic Tender interface may be more appropriate.)

    Another example is HTML::TokeParser::Simple. This is a drop-in replacement for HTML::TokeParser. You don't need to change the actual code, but you can then use all of the extra nifty features built in.
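
The cash register case might be sketched like this. The Order class, the constructors, the stubbed _authorize step, and the choice of "amount applied" as the return value are all assumptions for illustration; what matters is that the subclass changes the internals of apply() while keeping its interface and return form.

```perl
use strict;
use warnings;

{
    package Order;    # hypothetical: tracks only the amount still owed
    sub new     { my ($class, $total) = @_; return bless { balance => $total }, $class }
    sub balance { $_[0]->{balance} }
    sub pay     { my ($self, $amount) = @_; $self->{balance} -= $amount; return $self }
}

{
    package Tender::Cash;
    sub new    { my ($class, $amount) = @_; return bless { amount => $amount }, $class }
    sub amount { $_[0]->{amount} }

    # Contract assumed here: apply() pays toward the order and
    # returns the amount applied.
    sub apply {
        my ($self, $order) = @_;
        $order->pay( $self->amount );
        return $self->amount;
    }
}

{
    package Tender::CreditCard;
    use parent -norequire, 'Tender::Cash';

    # Different internals (a stubbed authorization step), same interface
    # and the same form of return value as the superclass.
    sub apply {
        my ($self, $order) = @_;
        $self->_authorize or die "Card declined";
        return $self->SUPER::apply($order);
    }

    sub _authorize { 1 }    # stub; a real class would talk to a payment processor
}

my $order   = Order->new(50);
my @tenders = ( Tender::Cash->new(20), Tender::CreditCard->new(30) );

foreach my $tender (@tenders) {
    $tender->apply($order);    # the caller never asks which kind of tender it has
}
print $order->balance, "\n";    # prints "0"
```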

Methods

  • Don't encourage promiscuous behavior
  • Hide your data, even that data which is public. Provide setters and getters for properties (accessors and mutators, if you prefer), rather than allowing people to reach into the object. Use these internally, too. You need them as much as users of your code need them.
    $object->{foo};

    This is a common idiom, but it's an example of an anti-pattern. What happens when you want to change that to an array ref? What happens when you want to use inside-out objects? What happens when you want to validate an assignment to this value?

    All of these issues and more crop up when you let people reach into the object. One of the major points of OO programming is to allow proper encapsulation of what's going on inside of the object. As soon as you let your defensive programming guard down, you're going to get bug reports. Use proper methods to handle this:

    $object->foo; $object->set_foo($foo);
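
As a rough sketch of what the accessor and mutator buy you, assume a hypothetical Widget class whose foo must be a non-negative integer (both the class name and the rule are invented for this example). The validation lives in one place, and the storage could later change without touching any caller.

```perl
use strict;
use warnings;

{
    package Widget;    # hypothetical class name

    sub new { my ($class) = @_; return bless { foo => 0 }, $class }

    sub foo { $_[0]->{foo} }

    # Validation lives in one place; callers never touch the hash.
    sub set_foo {
        my ($self, $foo) = @_;
        die "foo must be a non-negative integer\n"
            unless defined $foo && $foo =~ /\A\d+\z/;
        $self->{foo} = $foo;
        return $self;
    }
}

my $object = Widget->new;
$object->set_foo(42);
print $object->foo, "\n";    # prints "42"

# A bad value is rejected instead of silently corrupting the object.
eval { $object->set_foo('oops') };
print "rejected: $@" if $@;
```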

  • Don't expose state if you don't have to.
  • if ($object->error) { $object->log_errors } # bad!

    Whoops! Now we have a problem. Not only does every place in the code that might want to log errors have to first check if those errors exist, your log_errors method might erroneously assume that this has been checked. Check the state inside of the method.

    sub log_errors {
        my $self = shift;
        return $self unless $self->error;
        $self->_log_errors;
    }

    Better yet, there's a good chance that you're not concerned about the error log at runtime, so you could simply specify an error log in your constructor (or have the class use a default log), and let the module handle all of that internally.

    sub connect {
        my $self = shift;
        unless ($self->_get_rss_feed) {
            $self->_log_errors;
            $self->_fetch_cached_copy;
        }
        $self;
    }

    In the above example, there's an error that should be noted, but since a cached copy of data is acceptable, there's no need for the program to deal with this directly. The object notes the problem internally, adopts a fallback remedy and everything is peachy.

  • Keep your data structures uniform
  • (I saw this on use.perl but I can't remember who posted it)

    Assuming that a corresponding mutator exists, accessors should return a data structure that the mutators will accept. The following must always work:

    $object->set_foo( $object->get_foo );

    Failure to do this will cause no end of grief for programmers who assume that the object accepts the data structures that it emits.
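
A minimal sketch of a pair that satisfies this rule, assuming a hypothetical Article class with a list of tags: the accessor hands back an array reference, and the mutator accepts exactly that shape.

```perl
use strict;
use warnings;

{
    package Article;    # hypothetical class name

    sub new { my ($class) = @_; return bless { tags => [] }, $class }

    # The accessor returns an array reference (a copy, so callers
    # cannot modify the internals through it)...
    sub get_tags { [ @{ $_[0]->{tags} } ] }

    # ...and the mutator accepts exactly that: an array reference.
    sub set_tags {
        my ($self, $tags) = @_;
        die "set_tags() expects an array reference\n"
            unless ref($tags) eq 'ARRAY';
        $self->{tags} = [@$tags];
        return $self;
    }
}

my $article = Article->new->set_tags( [qw(perl oo)] );

# The round trip must neither die nor change the object's state.
$article->set_tags( $article->get_tags );
print join( ',', @{ $article->get_tags } ), "\n";    # prints "perl,oo"
```

The classic way to break this is an accessor that returns a list while the mutator wants a reference, or the other way around.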

Debugging

  • $object->as_string
  • Create a method (be cautious about overloading string conversions for this) to dump the state of an object. Many simply use YAML or Data::Dumper, but having a nice, human readable format can mean a world of difference when trying to debug a problem.

    Here's the YAML dump of a hypothetical product. Remember that, amongst other things, YAML is supposed to be human-readable.

    --- #YAML:1.0 !perl/Product
    bin: 19
    data:
      category: 7
      cost: 2.13
      name: Shirt
      price: 3.13
    id: 7
    inv: 22
    modified: 0

    Now here's hypothetical as_string() output that might be used in debugging (though you might want to tailor the method for public display).

    Product     7
    Name:       Shirt
    Category:   Clothing (7)
    Cost:       $2.13
    Price:      $3.13
    On-hand:    22
    Bin:        Aisle 3, Shelf 5b (19)
    Record not modified

    That's easier to read and, by doing lookups on the category and bin ids, you can present output that's easier to understand.
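
One possible as_string() producing output along those lines is sketched here. The lookup tables, the constructor, and the internal layout (taken from the YAML dump above) are assumptions; a real class would probably resolve category and bin ids through other objects or a database.

```perl
use strict;
use warnings;

# Hypothetical lookup tables standing in for real category and bin lookups.
my %CATEGORY_NAME = ( 7  => 'Clothing' );
my %BIN_LOCATION  = ( 19 => 'Aisle 3, Shelf 5b' );

{
    package Product;

    sub new { my ($class, %args) = @_; return bless {%args}, $class }

    # Human-readable dump for debugging: resolves ids to names and
    # formats money instead of echoing the raw internals.
    sub as_string {
        my $self = shift;
        my $data = $self->{data};
        my $category = $CATEGORY_NAME{ $data->{category} } || 'Unknown';
        my $bin      = $BIN_LOCATION{ $self->{bin} }       || 'Unknown';
        my @rows = (
            [ 'Product',   $self->{id} ],
            [ 'Name:',     $data->{name} ],
            [ 'Category:', "$category ($data->{category})" ],
            [ 'Cost:',     sprintf( '$%.2f', $data->{cost} ) ],
            [ 'Price:',    sprintf( '$%.2f', $data->{price} ) ],
            [ 'On-hand:',  $self->{inv} ],
            [ 'Bin:',      "$bin ($self->{bin})" ],
        );
        my @lines = map { sprintf '%-12s%s', @$_ } @rows;
        push @lines, 'Record ' . ( $self->{modified} ? 'modified' : 'not modified' );
        return join( "\n", @lines ) . "\n";
    }
}

my $product = Product->new(
    id   => 7, bin => 19, inv => 22, modified => 0,
    data => { category => 7, cost => 2.13, name => 'Shirt', price => 3.13 },
);
print $product->as_string;
```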

  • Test
  • I've saved the best for last for a good reason. Write a full set of tests! One of the nicest things about tests is that you can ask someone to run them if they submit a bug report. Failing that, it's a perfect way to ensure that a bug does not return, that your objects behave as documented and that you don't have "extra features" that you weren't expecting.

    One of the strongest objections to OO perl is the idiomatic object constructor:

    sub new {
        my ($class, %data) = @_;
        bless \%data => $class;
    }

    Which can then be followed with:

    sub set_some_property {
        my ($self, $property) = @_;
        $self->{some_prorety} = $property; # (sic)
        return $self;
    }

    sub some_property { $_[0]->{some_property} }

    And the tests:

    ok($object->set_some_property($foo), 'Setting a property should succeed');
    is($object->some_property, $foo, "... and fetching it should also succeed");

    Because blessing a hash reference is the most common method of creating objects in Perl, we lose many of the benefits of strict: a misspelled hash key is not a compile-time error. However, a proper test suite will catch issues like this and ensure that they don't recur. On a personal note, I've noticed that since I began testing, I sometimes forget to use strict, but my code has not been suffering for it. In fact, sometimes it's better, because I frequently write code for which strict would be a hassle. That's another example of where the rules get broken, but they're broken because the programmer knows when to break them.
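
For reference, here is roughly what the pieces look like as one runnable Test::More script, with the hash key spelled correctly so that both tests pass. The Some::Object class name and the test value are assumptions. With the misspelled key from the earlier snippet, the first test would still pass (the setter returns the object), and the second would fail because the accessor returns undef.

```perl
use strict;
use warnings;
use Test::More tests => 2;

{
    package Some::Object;    # hypothetical class name

    sub new {
        my ($class, %data) = @_;
        return bless {%data}, $class;
    }

    sub set_some_property {
        my ($self, $property) = @_;
        $self->{some_property} = $property;    # key spelled correctly this time
        return $self;
    }

    sub some_property { $_[0]->{some_property} }
}

my $object = Some::Object->new;
my $foo    = 'bar';

ok( $object->set_some_property($foo), 'Setting a property should succeed' );
is( $object->some_property, $foo, '... and fetching it should also succeed' );
```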

    Yet another fascinating thing about tests is the freedom they give you. If you have a comprehensive test suite, you can start taking liberties with your code in a way that you haven't before. Are you having performance problems because you're using an accessor in the bottom of a nested loop? If the object is a blessed hashref, you might get quite a performance boost by just "reaching inside" and grabbing the data you need directly. While many will tell you this is a no-no, the reason they mention this is for maintainability. However, a good test suite will protect you against many of the maintainability problems you may face (though it still won't make fixing your encapsulation violations any easier once you are bitten).

    That last paragraph might sound a bit curious. Is Ovid really telling people it's OK to violate encapsulation, particularly after he pointed out the evils of it?

    Yes, I am saying that. I'm not recommending that, but one thing that often gets lost in the shuffle when "paradigm" flame wars begin is that programming is a series of compromises. Rare indeed is the programmer who has claimed that she's never compromised the integrity of her code for performance, cost, or deadline pressures. We want to have a perfect system that people will "ooh" and "aah" over, but when you see the boss coming down the hall with a worried look, you realize that the latest nasty hack is going to make its way into production. Tests, therefore, are your friend. Tests will tell you if the nasty little hack works. Tests will tell you when the nasty little hack breaks.

    Test, damn you!


CONCLUSION

Many Perl programmers, including myself, learned Perl's OO syntax without knowing much about object-oriented programming. It's worth picking up a book or two and doing some reading about OO theory to pick up some of the tricks that, upon reflection, seem so obvious. Let the object do the work for you. Hide its internals carefully and don't force the programmer to worry about the object's state. All of the guidelines above can be broken, but knowing about them and why you want to follow them will tell you when it's OK to break them.

Update: I really should have called this "Often Overlooked Object Oriented Observations". Then we could refer to this node as "'O'x5".

Cheers,
Ovid
