
Refactor method calls or not?

by Ovid (Cardinal)
on Jan 18, 2002 at 22:51 UTC ( [id://139911] : perlmeditation )

Recently, one of the projects I have been working on has accumulated many duplicate method calls. I have refactored out much of the duplicated code, but the methods that remain are still pretty much the same. Here are two examples:

sub add_company {
    my ( $self, $data ) = @_;
    my $table = 'company';
    $data = $self->_generic_insert( $data, $table );
    $self->{ _dbh }->commit if ! $self->{ _error };
    return $data;
}

sub add_financial_diary {
    my ( $self, $data ) = @_;
    my $table = 'financialDiary';
    $data = $self->_generic_insert( $data, $table );
    $self->{ _dbh }->commit if ! $self->{ _error };
    return $data;
}

Note: I don't feel the need for transactions here, but several other methods need transactions and they share the same database handle so I do a commit. Any reason to set up a separate handle that doesn't require transactions?

These are basically the same method, with the exception of the method and table names. I'm thinking about turning this around: stuff the method names in a hash, with the table name as each value, then create an AUTOLOAD routine that traps the call and either autogenerates the method or throws an exception if the name is not found. There is no need to install the methods in the symbol table, as they are not called more than once per program run.

I am aware that this will likely slow down the performance of the program, but I was thinking that this will significantly reduce the size of the program and might make maintenance easier. There are several places where I could use this technique. Is this too trivial to really benefit from AUTOLOAD? WWMD? (What Would Monks Do?).
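A minimal sketch of what that AUTOLOAD scheme might look like. Note: the class name, the mock _generic_insert, and the method-to-table hash below are illustrative stand-ins, not Ovid's actual code.

```perl
# Sketch of the AUTOLOAD idea: a hash maps method names to tables,
# and AUTOLOAD generates the behavior on the fly without installing
# anything in the symbol table.
package MockAPI;
use strict;
use warnings;
use Carp;

my %METHOD_TO_TABLE = (
    add_company         => 'company',
    add_financial_diary => 'financialDiary',
);

sub new { return bless { _error => 0, rows => [] }, shift }

# Stand-in for the real database insert.
sub _generic_insert {
    my ( $self, $data, $table ) = @_;
    push @{ $self->{rows} }, $table;
    return $data;
}

our $AUTOLOAD;

sub AUTOLOAD {
    my ( $self, $data ) = @_;
    ( my $method = $AUTOLOAD ) =~ s/.*:://;
    return if $method eq 'DESTROY';    # don't trap object destruction
    my $table = $METHOD_TO_TABLE{$method}
        or croak "No such method: $method";
    return $self->_generic_insert( $data, $table );
}

package main;
my $api = MockAPI->new;
$api->add_company( { name => 'Acme' } );
print $api->{rows}[0], "\n";    # prints "company"
```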


Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Replies are listed 'Best First'.
Re: Refactor method calls or not?
by tadman (Prior) on Jan 18, 2002 at 23:11 UTC
    This might be crazy, but you could use a closure to generate the required subs and then load them into a hash. Something like this, perhaps:
    # Define appropriate "methods"
    my @dispatch_methods = qw( company financialDiary );

    # Routine to generate an anonymous sub using a closure
    sub dispatch_add {
        my ($what) = @_;
        return sub {
            my ( $self, $data ) = @_;
            my $table = $what;
            $data = $self->_generic_insert( $data, $table );
            $self->{ _dbh }->commit if ! $self->{ _error };
            return $data;
        };
    }

    # Build the dispatch table, trying to minimize error
    sub new {
        # Usual stuff...
        my $self = bless ...
        # :
        $self->{dispatch} = {
            map { $_ => dispatch_add($_) } @dispatch_methods
        };
    }
    Then you could call them along the lines of:
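The call example did not survive in this copy of the node; a plausible reconstruction is to invoke the closure stored in the dispatch hash by name. The closures below are simple stubs rather than the _generic_insert wrapper above.

```perl
use strict;
use warnings;

# Hypothetical demo of calling through a dispatch hash of closures.
# Each closure captures its own $table via the lexical inside the map.
my %dispatch = map {
    my $table = $_;
    ( $_ => sub { my ($data) = @_; return "$table: $data->{name}" } );
} qw( company financialDiary );

my $result = $dispatch{company}->( { name => 'Acme' } );
print $result, "\n";    # prints "company: Acme"
```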
    Of course, this might be completely nuts.

    Don't get me wrong. AUTOLOAD is fun, too. A hybrid approach would merely interface with the internal {dispatch} structure and call the appropriate function. This way they are pre-cached, and pre-validated. If there is no defined dispatch, AUTOLOAD can warn/die/carp appropriately.
Re: Refactor method calls or not?
by perrin (Chancellor) on Jan 18, 2002 at 23:24 UTC
    No need to obscure things with AUTOLOAD.
    %ALLOWED_TABLES = map { $_ => 1 } qw(company financialDiary);

    sub add_data {
        my ( $self, $table, $data ) = @_;
        die "bad table!" unless $ALLOWED_TABLES{$table};
        $data = $self->_generic_insert( $data, $table );
        $self->{ _dbh }->commit if ! $self->{ _error };
        return $data;
    }
    Of course you have to change your method calls, and maybe this is a problem for you.

    Regarding transactions, I think the transaction control belongs in the code that is calling these methods, not in the methods themselves. These data access methods don't know anything about the context they are being called in.
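A sketch of what moving transaction control into the calling code could look like; MockDBH and the commented-out add_* calls are illustrative stand-ins, not perrin's code.

```perl
use strict;
use warnings;

# The caller decides the transaction boundary around a group of
# related data-access calls; the methods themselves never commit.
package MockDBH;
sub new      { return bless { committed => 0, rolled_back => 0 }, shift }
sub commit   { $_[0]{committed}++ }
sub rollback { $_[0]{rolled_back}++ }

package main;
my $dbh = MockDBH->new;

my $ok = eval {
    # $api->add_company($company_data);
    # $api->add_financial_diary($diary_data);
    1;
};
if ($ok) { $dbh->commit } else { $dbh->rollback }
print $dbh->{committed}, "\n";    # prints "1"
```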

    UPDATE: fixed array name mismatch and changed to hash

      perrin wrote:

      Regarding transactions, I think the transaction control belongs in the code that is calling these methods, not in the methods themselves.

      There are some great ideas being tossed around here, but I want to address this issue specifically. In this system, I have what I think of as four layers:

      1. HTML Templates (via Template Toolkit)
      2. Main Programs
      3. Database API
        • Generic database parent class for all subclasses
        • Security subclass
        • Modify subclass (this is anything that changes the database and it has its own db user)
        • Select subclass (anything that only retrieves data)
      4. The actual database

      The idea here was to have the main programs grab or update data and pass the results to the HTML template, but have none of the programs allowed to touch the database directly. Item 3, the database API, is the database "gatekeeper", if you will. We have so many old projects where database code was strewn throughout the system that I felt grouping everything like this would be much safer. Am I wrong in my thinking? By having the calling code (item 2 above) handle the commit, I then move database specific code out of the designated layer. This doesn't seem appropriate, but I'd be happy to hear counter-arguments.



        That's a similar architecture to what we did at eCircles. We decided to create logical database "requests" that mapped to one or more stored procedure calls with explicit input and output variables.

        This means that the application code doesn't know anything about the database - it only knows about these database requests, which allowed us to decouple the applications from the database somewhat.

        As for transactions, with Sybase at least we managed to never have a transaction that spanned multiple database requests. I realize that this may not always be possible, but keeping transactions short helps tremendously on general throughput for the database.


        I've had the same thoughts about this. However, to make database commits you have to know the big picture, and your data access objects don't.

        One approach would be to make a set of "action" objects that are allowed to manage transactions and call the data objects, but have no web-specific code in them. This would be similar to what the Java folks call a "session facade". In Java this is done because EJB performance sucks rocks, but in this case it would simply be an organizational tool.
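An illustrative Perl sketch of such an "action" object; every class name here is hypothetical.

```perl
use strict;
use warnings;

# The action owns the transaction boundary; the data objects stay
# ignorant of commit/rollback, and no web-specific code lives here.
package MockAPI;
sub new         { return bless { rows => [] }, shift }
sub add_company { push @{ $_[0]{rows} }, $_[1]; return $_[1] }

package MockDBH;
sub new      { return bless { committed => 0 }, shift }
sub commit   { $_[0]{committed}++ }
sub rollback { return }

package AddCompanyAction;
sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

sub execute {
    my ( $self, $data ) = @_;
    my $ok = eval { $self->{api}->add_company($data); 1 };
    $ok ? $self->{dbh}->commit : $self->{dbh}->rollback;
    return $ok;
}

package main;
my $action = AddCompanyAction->new(
    api => MockAPI->new,
    dbh => MockDBH->new,
);
print $action->execute( { name => 'Acme' } ), "\n";    # prints "1"
```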

      There is no $spoon, er.. $type :) Did you mean $table there?
      Your grep either returns all values (if $type is true), or none at all (if $type is false). And die is evil outside the main program.
      I'd also use a lookup hash to avoid having to iterate over the array over and over.
      use Carp;
      ...
      my %tables;
      @tables{ qw/foo bar/ } = ();

      sub ... {
          ...
          croak 'Disallowed table' unless exists $tables{$table};
          ...
      }

      2;0 juerd@ouranos:~$ perl -e'undef christmas' Segmentation fault 2;139 juerd@ouranos:~$

        Thanks for catching those. I should have proofread before posting. And of course that should have been a hash. Don't know what I was thinking.

        The die is just a stand-in for whatever Ovid uses for exception handling.

      You don't need to change the method calls; just make the old methods wrappers for the new super-method. This way the code stays readable (and you don't need to make many changes) but the methods do not duplicate code. Mine is the voice of INexperience though. :)
        Unless Ovid thinks he'll have to go back and differentiate these methods again in the future, changing the calls to the method is probably not much more difficult than a search and replace in the calling code blocks. If he has already bounds-tested and otherwise proofed the data elements, a more generic insert/update method is called for-- and it will be faster to change the calling code than to refactor a long list of unique methods to call the generic method. If he has *not* proofed his values, though, he might want to leave these routines with unique names so that he can validate at this point-- which is probably too late {grin}.

(jeffa) Re: Refactor method calls or not?
by jeffa (Bishop) on Jan 18, 2002 at 23:53 UTC
    I like dragonchild's answer, but here is yet another way:
    # untested
    sub add_company         { return gen_add('company',        @_) }
    sub add_financial_diary { return gen_add('financialDiary', @_) }

    sub gen_add {
        my ($table,$self,$data) = @_;
        my $return = $self->_generic_insert( $data, $table );
        $self->{ _dbh }->commit if ! $self->{ _error };
        return $return;
    }


    (the triplet paradiddle with high-hat)
      The corollary to that, which uses autoloading, is:
      {
          my %calls = qw(
              company          company
              financial_diary  financialDiary
          );

          sub AUTOLOAD {
              our $AUTOLOAD;
              return if $AUTOLOAD =~ /::DESTROY$/;
              my ($meth) = $AUTOLOAD =~ /.*::add_(.*)/;
              if ( defined $meth and my $table = $calls{$meth} ) {
                  no strict 'refs';
                  *{"add_$meth"} = sub {
                      my $self = shift;
                      $self->gen_add($table, @_);
                  };
                  goto &{"add_$meth"};
              }
              require Carp;
              Carp::croak "Unknown table action '$AUTOLOAD'";
          }
      }

      Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
      s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Refactor method calls or not?
by dragonchild (Archbishop) on Jan 18, 2002 at 23:29 UTC
    If you can, put the onus on the program calling this code. You're suffering from what looks like the need to explicitly say which variables you can work with. Why not just have a generic add function that validates its input?
    sub generic_add {
        my $self = shift;
        my ($table, $data) = @_;

        return undef unless $self->is_valid_table($table);

        $data = $self->generic_insert($data, $table);
        return undef if $self->{_error};

        $self->{_dbh}->commit;
        return $data;
    }

    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means: worry only about what you need to implement.

Re (tilly) 1: Refactor method calls or not?
by tilly (Archbishop) on Jan 19, 2002 at 11:03 UTC
    You could also create and install these in the package by assigning closures to typeglobs.
    sub create_add_to_table {
        my $table = shift;
        sub {
            my ( $self, $data ) = @_;
            $data = $self->_generic_insert( $data, $table );
            $self->{ _dbh }->commit if ! $self->{ _error };
            return $data;
        };
    }

    *add_company         = create_add_to_table('company');
    *add_financial_diary = create_add_to_table('financialDiary');
Re: Refactor method calls or not?
by dws (Chancellor) on Jan 19, 2002 at 02:03 UTC
    Is this too trivial to really benefit from AUTOLOAD?

    AUTOLOAD-based schemes can work fine until someone else using the code tries to debug it. After watching a lot of time get burned up as people struggle through figuring out how to set a breakpoint when something in the bowels gets autoloaded, I now avoid using AUTOLOAD near main-line business logic, lest any savings get overwhelmed by maintenance cost.


Re: Refactor method calls or not?
by trs80 (Priest) on Jan 19, 2002 at 05:56 UTC
    I am going to advocate creating each method name explicitly, for the sake of future change. If you find that a method needs to be changed, it will already physically exist and you will just have to modify it. I know it isn't a big deal to add a method in the future, but if you can do it at this stage, I think you will be further ahead in the long run.

    I think AUTOLOAD is a bad way to go on this one. I had a discussion with some other programmers recently about using AUTOLOAD on something similar, and it was frowned upon for maintenance and speed reasons.

    If there is code already using these methods, I would also suggest code like this:
    sub add_company {
        my ($self,$data) = @_;
        $self->gen_add($data,'company');
    }

    sub add_financial_diary {
        my ($self,$data) = @_;
        $self->gen_add($data,'financialDiary');
    }

    sub gen_add {
        my ($self,$data,$table) = @_;
        my $return = $self->_generic_insert( $data, $table );
        $self->{ _dbh }->commit if ! $self->{ _error };
        return $return;
    }
    This code would allow you to move the gen_add method into another, inherited class if you need to, without breaking the code. By having $self and $data available in each wrapper, you also cut down on the code that needs to be written if you later determine that a method should not call gen_add after all.

Re: Refactor method calls or not?
by hding (Chaplain) on Jan 19, 2002 at 03:23 UTC

    Somewhat offtopic: I regret that I don't have a lot of insight into the Perl solution to this problem, but as the (or one of the) resident Common Lisp advocates (or trolls, if you wish :-), this is exactly the kind of thing we Lispers like our macro system for. It's fairly easy to write a CL macro that takes the method name and table name and creates an appropriate method, then just call it twice to get the specific methods you need (and of course, the macro would still be sitting around if more were needed in the future).

      If I'm not completely misunderstanding you, that's what could be done with closures in Perl - as mentioned in the first reply on this node by tadman++.

      Makeshifts last the longest.

        It's not exactly the same. For example, there would be no need to go through any contortions for dispatching and so forth. Two calls to the macro would create two normal methods called add-company and add-financial-diary, which one would then just use as usual.

        E.g. the Lisp code might look something like this (first the macro is defined, then it is called twice to create the methods, then a couple of calls to the methods):

        (defmacro make-add-method (method-name table-name)
          `(defmethod ,method-name ((self my-class) data)
             (let ((returned-data (generic-insert self data ,table-name)))
               (unless (check-error self)
                 (commit (database-handle self)))
               returned-data)))

        (make-add-method add-company "company")
        (make-add-method add-financial-diary "financialDiary")

        (add-company my-object company-data)
        (add-financial-diary my-object financial-data)

        The macro does indeed work something like the method of using closures in the previous node - it takes a template for the code that we want (the form following the backquote) and fills in the parts that we want to vary (the forms introduced by ,). There's no need for any of the dispatching contortions, though; it's just as though we typed in the methods ourselves. While Lisp macros are great for many more things than this, they work well for this sort of thing as well, where pieces of code have very similar form, which can then be easily and naturally abstracted out.

Re: Refactor method calls or not?
by Aristotle (Chancellor) on Jan 20, 2002 at 14:41 UTC

    Of all the replies, ++tilly's seems both closest to the original goal and cleanest. If you really have a lot of such methods, I'd add a hash that maps add_financial_diary => 'financialDiary' so that a foreach in BEGIN() could generate these, but of course that's an obvious detail.

    On the other hand I wouldn't put AUTOLOAD aside so soon despite all the advocacy against it. Since you mention that you typically only need these methods once or twice in a program's runtime, it seems an obvious idea to skip installing the methods in the symbol table altogether, in which case I find it can be written pretty cleanly. Imagine japhy's code here, but with the innermost if block replaced with something like (where _generic_add() directly does the job rather than generating a closure of course):

    my $table = $calls{$meth}
        || croak "Unknown table action '$meth'";
    $self->_generic_add($table, @_);

    It is easy to substitute this for a loop in BEGIN() that installs routines which call _generic_add() if you later need more performance - and the only place that changes is a shift of code from AUTOLOAD to BEGIN, the rest of the module remains untouched, including the generic add routine and the "dispatch" hash.
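A sketch of that BEGIN()-time variant: the same method-to-table hash drives a loop that installs real subs via typeglob assignment. The class name and mocked _generic_add are illustrative, not code from the thread.

```perl
use strict;
use warnings;

package MockAPI;

my %calls;

BEGIN {
    # Assign inside BEGIN so the hash is populated at compile time,
    # before the loop that reads it runs.
    %calls = (
        add_company         => 'company',
        add_financial_diary => 'financialDiary',
    );
    no strict 'refs';
    while ( my ( $method, $table ) = each %calls ) {
        # Each iteration gets a fresh $table lexical, so every
        # installed sub closes over its own table name.
        *{$method} = sub {
            my $self = shift;
            return $self->_generic_add( $table, @_ );
        };
    }
}

sub new { return bless { rows => [] }, shift }

# Stand-in for the real generic add routine.
sub _generic_add {
    my ( $self, $table, $data ) = @_;
    push @{ $self->{rows} }, $table;
    return $data;
}

package main;
my $api = MockAPI->new;
$api->add_company( { name => 'Acme' } );
print $api->{rows}[0], "\n";    # prints "company"
```

Switching between this and the AUTOLOAD version really does touch only this one block; the dispatch hash and the generic add routine stay put.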

    Makeshifts last the longest.