Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

Object Matching, speed issue

by PerlingTheUK (Hermit)
on Aug 25, 2004 at 18:22 UTC ( [id://385767] : perlquestion . print w/replies, xml ) Need Help??

PerlingTheUK has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I have an object oriented data structure with singleton objects. These objects hold a number of database rows as as a kind of parent. I now want to create a function that matches such objects individually. For eaxample for a table with drinks I would like to get all drinks, that are without sugar but are black.
My database contains a couple of thoiusands of such objects. I have a function calles match that receives a hash with the column name and the content.
my $ret = $src_drinks->match( { sugar => "no", color => "black" } );
does what an equal SQL query
SELELCT * FROM drinks WHERE sugar LIKE "no" AND color LIKE "black";
would do.
My match function is as follows:
# sub match # return all objects that match the requested options sub match{ my ( $self, $hashref ) = @_; my @keys = keys %{ $hashref }; if ( ( @keys == 1 ) && defined( $self->{ _matched }->{ $keys[0]."@". +$hashref->{ $keys[0] } } ) ) { return $self->{ _matched }->{ $keys[0]."@".$hashref->{ $keys[0] } + }; } # STEP 2: now go through all objects and return ptr to list of match +ed objects or 0, if none found. my @ret; foreach my $obj ( @{ $self->get_data() } ) { my $equals = 1; foreach my $key ( @keys ){ my $s = "\$obj->get_".$key."()"; $equals = 0 unless ( $hashref->{ $key } eq eval( $s ) ); } push ( @ret, $obj ) if ( $equals == 1 ); } if ( @ret == 0 ){ return 0; } $self->{ _matched }->{ $keys[0]."@".$hashref->{ $keys[0] } } = \@ret + if ( @keys == 1 ); return \@ret; }
This is extremely slow. Due to hundreds of calls to eval. Does anyone have any ideas how to speed things up a bit, without changing the whole data structure?
As you can see I have speeded up the whole process by saving all "single-column" matches, this is currently not memory critical but might become so I might really need to get anything faster to come around getting rid of this performance bit either.


Replies are listed 'Best First'.
Re: Object Matching, speed issue
by dragonchild (Archbishop) on Aug 25, 2004 at 18:41 UTC
    Use DBD::SQLite - you are replicating the functionality of a database, so use one. *shrugs*

    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      I finally bit the bullet and started using SQLite on a project of mine, and it's awesome. I couldn't recommend it more.
        I considered using it, and do so for some parts of my code. this data bit I described actually changes a lot, and a user might have about 50 to 500 different files where the data is stored. Within the files there are different tables and a lot of the calculations I do get very complicated once i have to restore it from the database. Therefore I have not gone this step now. But might. Yet this speed issue is more pressing than my intention to port the whole thing to SQLite.
        Thanks for the advice.

Re: Object Matching, speed issue
by revdiablo (Prior) on Aug 25, 2004 at 19:55 UTC

    The call to eval is unnecessary. Try this instead (no promises on speeding up the code, as I haven't profiled it, but at least this eliminates the eval):

    foreach my $key ( @keys ){ my $meth = "get_$key"; $equals = 0 unless ( $hashref->{ $key } eq $obj->$meth() ); }

    Update: a simple benchmark shows that this way is indeed much faster than using eval. Again, not having profiled the code, I'm not sure how representative this is, but it's interesting.

    Here's the benchmark:

    #!/usr/bin/perl use strict; use warnings; use Benchmark qw(timethese cmpthese); { package foo; sub get_foo { "@_" }; sub get_bar { "@_" } } my $puk = sub { for (qw(foo bar)) { my $code = "foo->get_$_(qq(bar))"; eval $code; } }; my $rev = sub { for (qw(foo bar)) { my $meth = "get_$_"; foo->$meth("bar"); } }; cmpthese timethese(-2, { rev => $rev, puk => $puk, });

    And here are the results:

    Benchmark: running puk, rev for at least 2 CPU seconds... puk: 2 wallclock secs ( 2.08 usr + 0.00 sys = 2.08 CPU) @ 16 +447.12/s (n=34210) rev: 3 wallclock secs ( 2.08 usr + 0.00 sys = 2.08 CPU) @ 21 +7107.21/s (n=451583) Rate puk rev puk 16447/s -- -92% rev 217107/s 1220% --
      I was not aware you can do that. Thank you, great help.

Re: Object Matching, speed issue
by fizbin (Chaplain) on Aug 25, 2004 at 22:07 UTC
    Well, revdiablo has already given you one way to do it using string sub references. (Note that in a "use strict" environment, you may need to put "no strict subs;" somewhere in that code to have it work)

    Here's another way, which is something to consider in general if you find yourself running eval inside a big loop. Basically, build up a string that will eval to a sub reference, and then use the sub reference in the loop. So STEP 2 becomes:

    my $subtext = "sub {\n"; $subtext .= join (" and ", map { '$_->get_'.$_.'() eq "'.quotemeta($hashref->{ $_ }).'"' } @key +s); $subtext .= ";\n}"; my $selectsub = eval($subtext); my @ret = grep(&$selectsub, @{ $self->get_data() });
    And that's it. Note that use of a sub reference also lets you easily use grep and not have to build up a return list yourself.
    -- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
      Note that in a "use strict" environment, you may need to put "no strict subs;" somewhere in that code to have it work

      Actually, no, you don't. strict doesn't care about softrefs for method calls.