perlmeditation
Ovid
object
<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->
<ul>
<li><a href="#name">NAME</a></li>
<li><a href="#synopsis">SYNOPSIS</a></li>
<li><a href="#problems">PROBLEMS</a></li>
<ul>
<li><a href="#useless_oo">Useless OO</a></li>
<li><a href="#object_heirarchy">Object Hierarchy</a></li>
<li><a href="#methods">Methods</a></li>
<li><a href="#debugging">Debugging</a></li>
</ul>
<li><a href="#conclusion">CONCLUSION</a></li>
</ul>
<!-- INDEX END -->
<hr />
<h1><a name="name">NAME</a></h1>
<p>Often Overlooked Object Oriented Programming Guidelines</p>
<hr />
<h1><a name="synopsis">SYNOPSIS</a></h1>
<p>The following is not about how to write OO code in Perl. There are plenty of
nodes covering that topic. Instead, this is a general list of tips that I
like to keep in mind when I'm writing OO code. It's not exhaustive, but it
does cover a number of areas that I see many people (including myself) get
wrong or overlook.</p>
<readmore>
<hr />
<h1><a name="problems">PROBLEMS</a></h1>
<h2><a name="useless_oo">Useless OO</a></h2>
<ul>
<li><strong><a name="item_don_27t_use_what_you_don_27t_need_2e">Don't use what you don't need.</a></strong><br />
</li>
Don't use OO if you don't need it. No sense in creating an object if there
is nothing to encapsulate.
<code>
sub new {
    my ($class, %data) = @_;
    return bless \%data, $class;
}</code>
<p>This constructor is not unusual, but it's suggestive of a useless use
of OO. A good example of this is [cpan://Acme::Playmate] (er, maybe not the
<em>best</em> example). The module consists of a constructor. That's it. And
here's the documented usage:</p>
<code>
use Acme::Playmate;
my $playmate = new Acme::Playmate("2003", "04");
print "Details for playmate " . $playmate->{ "Name" } . "\n";
print "Birthdate" . $playmate->{ "BirthDate" } . "\n";
print "Birthplace" . $playmate->{ "BirthPlace" } . "\n";</code>
<p>Regardless of whether or not you feel this is a useful module, there's nothing
OO about it. In fact, with the exception of methods this module inherits from
<code>UNIVERSAL::</code>, it has no methods other than the constructor. All it does is
return a data structure that just happens to be blessed (the jokes are
obvious; we don't need to go there).</p>
<p>Of course, this is merely an <code>Acme::</code> module, so discussing how well a joke
conforms to good programming practices is probably not warranted, but read through
[id://91080|Damian Conway's 10 Rules for When to Use OO] to get a good feel
for when OO is appropriate.</p>
</ul>
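<p>When there is no behavior to encapsulate, a plain function that returns a hash reference does the same job without pretending to be an object. Here's a minimal sketch; <code>fetch_details</code> and the data it returns are hypothetical and not taken from any real module:</p>

```perl
use strict;
use warnings;

# Hypothetical lookup function: it returns plain data, so there is
# nothing to bless and no class to maintain.
sub fetch_details {
    my ($year, $month) = @_;
    return {
        Year  => $year,
        Month => $month,
    };
}

my $details = fetch_details('2003', '04');
print "Year: $details->{Year}, month: $details->{Month}\n";
```

<p>The caller uses the hash exactly as before, but nobody is misled into looking for methods that don't exist.</p>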
<h2><a name="object_heirarchy">Object Hierarchy</a></h2>
<ul>
<li><strong><a name="item_don_27t_subclass_simply_to_alter_data">Don't subclass simply to alter data</a></strong><br />
</li>
<p>Subclass when you need a more specific instance of a class, not just to change
data. If data is all that changes, you simply want an instance of the class, not a new
class. Subclass to alter or add behavior. While I don't see this problem a lot, I see it enough that it merits discussion.</p>
<code>
package Some::User;
sub new {
    bless {}, shift;
}
sub user { die "user() must be implemented in subclass" }
sub pass { die "pass() must be implemented in subclass" }
sub url { die "url() must be implemented in subclass" }</code>
<p>On the surface, this might appear to simply be an interface that will be used
as a base class for a set of classes. However, sometimes people get confused
and simply override those methods to return data:</p>
<code>
package Some::User::Foo;
sub user { 'bob' }
sub pass { 'seKret' }
sub url { 'http://somesite.com/' }</code>
<p>There's really no reason for that. Make it an instance:</p>
<code>
my $foo = Some::User->new('Foo');</code>
<p>Thus, if you need to change how things work internally, you're doing that on
only one class rather than hunting through a bunch of useless subclasses.</p>
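<p>One way the instance-based version might look is sketched below. The named arguments and the required-field check are my assumptions for illustration; the point is only that the data travels through the constructor rather than through a subclass:</p>

```perl
use strict;
use warnings;

package Some::User;

# Hypothetical data-driven constructor: each user is an instance,
# so no subclass is needed just to hold different values.
sub new {
    my ($class, %args) = @_;
    for my $field (qw(user pass url)) {
        die "$field is required" unless defined $args{$field};
    }
    return bless { %args }, $class;
}

sub user { $_[0]->{user} }
sub pass { $_[0]->{pass} }
sub url  { $_[0]->{url} }

package main;

my $foo = Some::User->new(
    user => 'bob',
    pass => 'seKret',
    url  => 'http://somesite.com/',
);
print $foo->user, "\n";
```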
<li><strong><a name="item_law_of_demeter">Law of Demeter</a></strong><br />
</li>
[http://c2.com/cgi/wiki?LawOfDemeter|The Law of Demeter] simply states that
you should only talk to your immediate friends -- using a chain of method
calls to navigate an object hierarchy is begging for trouble. For example, if
an office object has a manager object, an instance of that manager might have
a name.
<code>
print $office->manager->name;</code>
<p>That seems all fine and dandy. Now, imagine that you have that in 20 places
in your code, but in the manager class, someone changes <code>name</code> to
<code>full_name</code>. Because the code using the office object was forced to walk
through the object hierarchy to get at the data it actually needs, you've
created fragile code. Now the manager class must support a <code>name</code> method
to be backwards compatible (and we get to start on our big ball of mud), or
every reference to it must be changed -- but we've created far too many.</p>
<p>The solution is to do this:</p>
<code>
print $office->manager_name; # manager_name calls $manager->name</code>
<p>Now, instead of hunting down all of the places where this was accessed,
we've limited this call to one spot and made maintenance much easier. This can, however, lead to code bloat. Make sure you understand the tradeoffs involved.</p>
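<p>A minimal sketch of that delegation follows. The <code>Office</code> and <code>Manager</code> class names and their constructors are assumptions made up for this example:</p>

```perl
use strict;
use warnings;

package Manager;
sub new  { my ($class, %args) = @_; return bless { %args }, $class }
sub name { $_[0]->{name} }

package Office;
sub new     { my ($class, %args) = @_; return bless { %args }, $class }
sub manager { $_[0]->{manager} }

# The only place that knows how a manager exposes its name. If
# Manager::name is renamed, this is the single line to update.
sub manager_name {
    my $self = shift;
    return $self->manager->name;
}

package main;

my $office = Office->new( manager => Manager->new( name => 'Alice' ) );
print $office->manager_name, "\n";
```

<p>The cost is one small forwarding method per piece of data you expose this way, which is where the code bloat comes from.</p>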
<li><strong><a name="item_liskov_substitution_principle">Liskov substitution principle</a></strong><br />
</li>
While there is disagreement over what this means, this principle states
(paraphrasing) that a subclass must present the same interface as its
superclass. Some argue that the behavior of subclasses (or subtypes) should
not change, though I feel that with proper encapsulation, this distinction
goes away. For example, imagine a cash register program where a person's
order is paid via a combination of credit card, check, and cash (such as when
three people annoy the waiter by splitting the bill).
<code>
foreach my $tender (@tenders) {
    $tender->apply($order);
}</code>
<p>In this case, let's assume there is a <code>Tender::Cash</code> superclass and
subclasses along the lines of <code>Tender::CreditCard</code> and
<code>Tender::LetsHopeThisDoesntBounce</code>. The credit card and check classes can be
used exactly as if they were cash. Their <tt>apply()</tt> methods are probably different
internally, but every method that's available for cash should be available for
the subclasses, and the data returned should be identical in form. (This
might be a bad example, as a generic <code>Tender</code> interface may be more
appropriate.)</p>
<p>Another example is [cpan://HTML::TokeParser::Simple]. This is a drop-in
replacement for [cpan://HTML::TokeParser]. You don't need to change the
actual code, but you can then use all of the extra nifty features built in.</p></ul>
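<p>Here's a rough sketch of the cash register example, under the assumption that every tender has an <code>amount</code> and that <code>apply()</code> returns the order. The <code>Order</code> class and the <code>_authorize</code> step are hypothetical and exist only to make the example run:</p>

```perl
use strict;
use warnings;

package Order;
sub new {
    my ($class, %args) = @_;
    return bless { balance => $args{total} }, $class;
}
sub balance { $_[0]->{balance} }
sub pay {
    my ($self, $amount) = @_;
    $self->{balance} -= $amount;
    return $self;
}

package Tender::Cash;
sub new {
    my ($class, %args) = @_;
    return bless { amount => $args{amount} }, $class;
}
sub amount { $_[0]->{amount} }

# Every tender applies itself to an order and returns the order.
sub apply {
    my ($self, $order) = @_;
    return $order->pay( $self->amount );
}

package Tender::CreditCard;
our @ISA = ('Tender::Cash');

# Same interface and return value as the superclass; only the
# internals differ.
sub apply {
    my ($self, $order) = @_;
    $self->_authorize;    # hypothetical card-specific step
    return $self->SUPER::apply($order);
}
sub _authorize { $_[0]->{authorized} = 1 }

package main;

my $order   = Order->new( total => 30 );
my @tenders = (
    Tender::Cash->new( amount => 10 ),
    Tender::CreditCard->new( amount => 20 ),
);
$_->apply($order) for @tenders;
print "Balance due: ", $order->balance, "\n";
```

<p>The loop never asks what kind of tender it is holding, and that's the whole point: the subclass can stand in wherever the superclass is expected.</p>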
<h2><a name="methods">Methods</a></h2>
<ul>
<li><strong><a name="item_don_27t_encourage_promiscuous_behavior">Don't encourage promiscuous behavior</a></strong><br />
</li>
Hide your data, even that data which is public. Provide setters and getters
for properties (accessors and mutators, if you prefer), rather than allowing
people to reach into the object. Use these internally, too. You need them as
much as users of your code need them.
<code>
$object->{foo};</code>
<p>This is a common idiom, but it's an example of an anti-pattern. What happens
when you want to change that to an array ref? What happens when you want to
use inside-out objects? What happens when you want to validate an
assignment to this value?</p>
<p>All of these issues and more crop up when you let people reach into the
object. One of the major points of OO programming is to allow proper
encapsulation of what's going on inside of the object. As soon as you let your defensive programming guard down, you're going to get bug reports. Use proper methods to handle this:</p>
<code> $object->foo;
$object->set_foo($foo);</code>
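<p>As a sketch of what those methods buy you, here is a hypothetical <code>Widget</code> class whose mutator validates its argument. The class name and the "non-negative integer" rule are assumptions for illustration only:</p>

```perl
use strict;
use warnings;

package Widget;    # hypothetical class

sub new { my ($class) = @_; return bless { foo => 0 }, $class }

# Accessor: callers never learn how foo is stored.
sub foo { $_[0]->{foo} }

# Mutator: the one place to validate assignments to foo.
sub set_foo {
    my ($self, $foo) = @_;
    die "foo must be a non-negative integer\n"
        unless defined $foo && $foo =~ /\A\d+\z/;
    $self->{foo} = $foo;
    return $self;
}

package main;

my $object = Widget->new;
$object->set_foo(42);
print $object->foo, "\n";
```

<p>If the storage later changes to an array ref or an inside-out object, only <code>foo</code> and <code>set_foo</code> need to change.</p>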
<li><strong><a name="item_don_27t_expose_state_if_you_don_27t_have_to_2e">Don't expose state if you don't have to.</a></strong><br />
</li>
<code>
if ($object->error) {
    $object->log_errors
} # bad!</code>
<p>Whoops! Now we have a problem. Not only does every place in the code that
might want to log errors have to first check if those errors exist, your
<code>log_errors</code> method might erroneously assume that this has been checked.
Check the state inside of the method.</p>
<code>
sub log_errors {
    my $self = shift;
    return $self unless $self->error;
    $self->_log_errors;
}</code>
<p>Better yet, there's a good chance that you're not concerned about the error
log at runtime, so you could simply specify an error log in your constructor
(or have the class use a default log), and let the module handle all of that
internally.</p>
<code>
sub connect {
    my $self = shift;
    unless ($self->_get_rss_feed) {
        $self->_log_errors;
        $self->_fetch_cached_copy;
    }
    return $self;
}</code>
<p>In the above example, there's an error that should be noted, but since a
cached copy of data is acceptable, there's no need for the program to deal
with this directly. The object notes the problem internally, adopts a
fallback remedy and everything is peachy.</p>
<li><strong><a name="item_keep_your_data_structures_uniform">Keep your data structures uniform</a></strong><br />
</li>
(I saw this on use.perl but I can't remember who posted it)
<p>Assuming that a corresponding mutator exists, accessors should return a data structure that the mutators will accept. The following must always work:</p>
<code>
$object->set_foo( $object->get_foo );</code>
<p>Failure to do this will cause no end of grief for programmers who assume
that the object accepts the data structures that it emits.</p>
</ul>
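<p>To make the round-trip rule concrete, here is a small sketch. The <code>Playlist</code> class and its <code>tracks</code> property are hypothetical; what matters is that the accessor and the mutator agree on an array reference:</p>

```perl
use strict;
use warnings;

package Playlist;    # hypothetical class

sub new { my ($class) = @_; return bless { tracks => [] }, $class }

# Mutator takes an array reference and stores a copy.
sub set_tracks {
    my ($self, $tracks) = @_;
    die "set_tracks() expects an array reference\n"
        unless ref($tracks) eq 'ARRAY';
    $self->{tracks} = [@$tracks];
    return $self;
}

# Accessor returns an array reference too, never a flat list,
# so its result can be handed straight back to set_tracks().
sub get_tracks { [ @{ $_[0]->{tracks} } ] }

package main;

my $list = Playlist->new->set_tracks( [ 'one', 'two' ] );
$list->set_tracks( $list->get_tracks );    # round trip must always work
print join( ', ', @{ $list->get_tracks } ), "\n";
```

<p>Had <code>get_tracks</code> returned a flat list, the round trip would have died inside the mutator's own check.</p>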
<h2><a name="debugging">Debugging</a></h2>
<ul>
<li><strong><a name="item__24object_3eas_string"><code>$object->as_string</code></a></strong><br />
</li>
Create a method (be cautious about overloading string conversions for this) to
dump the state of an object. Many simply use [cpan://YAML] or
[cpan://Data::Dumper], but having a nice, human readable format can mean a
world of difference when trying to debug a problem.
<p>Here's the [cpan://YAML] dump of a hypothetical product. Remember that,
amongst other things, YAML is supposed to be human-readable.</p>
<code>
--- #YAML:1.0 !perl/Product
bin: 19
data:
  category: 7
  cost: 2.13
  name: Shirt
  price: 3.13
id: 7
inv: 22
modified: 0</code>
<p>Now here's hypothetical <code>as_string()</code> output that might be used in
debugging (though you might want to tailor the method for public display).</p>
<code>
Product 7
Name: Shirt
Category: Clothing (7)
Cost: $2.13
Price: $3.13
On-hand: 22
Bin: Aisle 3, Shelf 5b (19)
Record not modified</code>
<p>That's easier to read and, by doing lookups on the category and bin ids, you
can present output that's easier to understand.</p>
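<p>A possible <code>as_string()</code> for that product might look like the following sketch. The <code>Product</code> internals mirror the YAML dump above, but the two lookup hashes are assumptions standing in for whatever really maps ids to names:</p>

```perl
use strict;
use warnings;

package Product;    # hypothetical class; the lookup tables stand in for a database

my %CATEGORY = ( 7  => 'Clothing' );
my %BIN      = ( 19 => 'Aisle 3, Shelf 5b' );

sub new { my ($class, %args) = @_; return bless { %args }, $class }

# Build a human-readable summary, resolving ids to names on the way.
sub as_string {
    my $self     = shift;
    my $data     = $self->{data};
    my $category = $CATEGORY{ $data->{category} };
    my $bin      = $BIN{ $self->{bin} };
    return join "\n",
        "Product $self->{id}",
        "Name: $data->{name}",
        "Category: $category ($data->{category})",
        sprintf( 'Cost: $%.2f',  $data->{cost} ),
        sprintf( 'Price: $%.2f', $data->{price} ),
        "On-hand: $self->{inv}",
        "Bin: $bin ($self->{bin})",
        ( $self->{modified} ? 'Record modified' : 'Record not modified' );
}

package main;

my $product = Product->new(
    id   => 7, inv => 22, bin => 19, modified => 0,
    data => { name => 'Shirt', category => 7, cost => 2.13, price => 3.13 },
);
print $product->as_string, "\n";
```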
<li><strong><a name="item_test">Test</a></strong><br />
</li>
I've saved the best for last for a good reason. Write a full set of tests!
One of the nicest things about tests is that you can ask someone to run them if
they submit a bug report. Failing that, it's a perfect way to ensure that a
bug does not return, that your objects behave as documented and that you don't
have "extra features" that you weren't expecting.
<p>One of the strongest objections to OO Perl is the idiomatic object
constructor:</p>
<code>
sub new {
    my ($class, %data) = @_;
    bless \%data => $class;
}</code>
<p>Which can then be followed with:</p>
<code>
sub set_some_property {
    my ($self, $property) = @_;
    $self->{some_prorety} = $property; # (sic)
    return $self;
}
sub some_property { $_[0]->{some_property} }</code>
<p>And the tests:</p>
<code>
ok($object->set_some_property($foo), 'Setting a property should succeed');
is($object->some_property, $foo, "... and fetching it should also succeed");</code>
<p>Because blessing a hash reference is the most common method of creating
objects in Perl, we lose many of the benefits of <code>strict</code>: a misspelled
hash key is not a compile-time error. However, a proper
test suite will catch issues like this and ensure that they don't recur.
On a personal note, I've noticed that since I've begun testing, I sometimes
forget to use <code>strict</code>, but my code has not been suffering for it. In fact,
sometimes it's better, because I frequently write code for which <code>strict</code>
would be a hassle. That's another case where the rules get broken,
but they're broken because the programmer knows when to break them.</p>
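<p>For anyone who wants to try this, here is a self-contained sketch using [cpan://Test::More]. The <code>Thing</code> class name is made up, and the hash key typo has been corrected so the script passes; put the typo back and the final <code>is()</code> fails, which is exactly the safety net <code>strict</code> can't provide here:</p>

```perl
use strict;
use warnings;
use Test::More tests => 3;

package Thing;    # hypothetical class under test

sub new { my ($class, %data) = @_; return bless { %data }, $class }

sub set_some_property {
    my ($self, $property) = @_;
    $self->{some_property} = $property;    # key spelled correctly this time
    return $self;
}
sub some_property { $_[0]->{some_property} }

package main;

my $object = Thing->new;
my $foo    = 'bar';

isa_ok( $object, 'Thing' );
ok( $object->set_some_property($foo), 'Setting a property should succeed' );
is( $object->some_property, $foo, '... and fetching it should also succeed' );
```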
<p>Yet another fascinating thing about tests is the freedom they give you. If you
have a comprehensive test suite, you can start taking liberties with your code
in a way that you haven't before. Are you having performance problems because
you're using an accessor in the bottom of a nested loop? If the object is a
blessed hashref, you might get quite a performance boost by just "reaching
inside" and grabbing the data you need directly. While many will tell you this
is a no-no, the reason they mention this is for maintainability. However, a
good test suite will protect you against many of the maintainability problems
you may face (though it still won't make fixing your encapsulation violations
any easier once you are bitten).</p>
<p>That last paragraph might sound a bit curious. Is [Ovid] <em>really</em> telling
people it's OK to violate encapsulation, particularly after he pointed out the
evils of it?</p>
<p>Yes, I am saying that. I'm not recommending that, but one thing that often
gets lost in the shuffle when "paradigm" flame wars begin is that programming
is a series of compromises. Rare indeed is the programmer who has claimed that
she's never compromised the integrity of her code for performance, cost, or
deadline pressures. We <em>want</em> to have a perfect system that people will "ooh"
and "aah" over, but when you see the boss coming down the hall with a worried
look, you realize that the latest nasty hack is going to make its way into
production. Tests, therefore, are your friend. Tests will tell you if the
nasty little hack works. Tests will tell you when the nasty little hack breaks.</p>
<p>Test, damn you!</p>
<p></p></ul>
<hr />
<h1><a name="conclusion">CONCLUSION</a></h1>
<p>Many Perl programmers, including myself, learned Perl's OO syntax without
knowing much about object-oriented programming. It's worth picking up a book
or two, doing some reading about OO theory, and picking up some of the tricks
that, upon reflection, seem so obvious. Let the object do the work for you. Hide its internals carefully and don't force the programmer to worry about the object's state. All of the guidelines above can be broken, but knowing about them and why you want to follow them will tell you when it's OK to break them.</p>
<p><strong>Update:</strong> I really should have called this "Often Overlooked Object Oriented Observations". Then we could refer to this node as "'O'x5".</p>
<div class="pmsig"><div class="pmsig-17000">
<p>Cheers,<br />
<a href="/index.pl?node=Ovid&lastnode_id=1072">Ovid</a></p>
<p><small>New address of <a href="http://users.easystreet.com/ovid/cgi_course/">my CGI Course</a>.</small></p>
</div></div>