PerlMonks  

Storable::freeze performance problem

by Smoothhound (Monk)
on Jun 24, 2005 at 16:23 UTC ( [id://469753]=perlquestion )

Smoothhound has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed monks,

I've run into a frustrating performance problem using Storable.

I have an application that uses Storable to freeze/thaw large hashrefs as scalars in a database. The large hashref in question holds an arbitrary number of pieces of information about an arbitrary number of job applicants in the form:

$data->{$data_id}->{$applicant_id} = $value;

The hashref is then used to generate reports on the data, the idea being that subsequent runs of the report don't need to retrieve all the data again.
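A minimal sketch of the round trip described above, assuming made-up sample values and leaving out the database layer entirely; only the `freeze`/`thaw` calls are taken from the question:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical sample data in the shape described above:
# $data->{$data_id}->{$applicant_id} = $value
my $data = {};
for my $data_id (1 .. 3) {
    for my $applicant_id (1 .. 100) {
        $data->{$data_id}->{$applicant_id} = "value-$data_id-$applicant_id";
    }
}

# freeze() serialises the whole structure into a single scalar,
# which is what would be written to the database column.
my $frozen = freeze($data);

# thaw() rebuilds an equivalent structure from that scalar.
my $copy = thaw($frozen);

print scalar(keys %$copy), " data ids, ",
      scalar(keys %{ $copy->{1} }), " applicants each\n";
```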

For smaller data sets I get a significant performance improvement compared to fetching the data every time, but once the dataset reaches a certain size (approximately 3 $data_id values and 7500 $applicant_id values) performance degrades horribly.

A dprofpp trace shows that it's the freezing and not the thawing that's the problem. Strangely, the problem only occurs if the freeze takes place after a thaw in the same run. The problem also does not occur if I store/retrieve to a file, but unfortunately that is not an option in our production environment.
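A rough way to reproduce that comparison outside the application might look like the sketch below. The helper names `build_data` and `timed_freeze` and the sample values are made up for illustration, and the timings will depend entirely on the perl and Storable versions installed:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Time::HiRes qw(time);

# Build a hashref of roughly the size where the slowdown was reported.
sub build_data {
    my ($n_ids, $n_applicants) = @_;
    my %data;
    for my $data_id (1 .. $n_ids) {
        $data{$data_id}{$_} = "value $_" for 1 .. $n_applicants;
    }
    return \%data;
}

# Time a single freeze() call; returns (elapsed seconds, frozen scalar).
sub timed_freeze {
    my ($ref) = @_;
    my $start  = time();
    my $frozen = freeze($ref);
    return (time() - $start, $frozen);
}

my $data = build_data(3, 7500);

my ($before, $frozen) = timed_freeze($data);    # freeze with no prior thaw
my $thawed            = thaw($frozen);          # thaw in the same run
my ($after)           = timed_freeze($thawed);  # freeze again after the thaw

printf "freeze before thaw: %.4fs\n", $before;
printf "freeze after thaw:  %.4fs\n", $after;
```

If the second figure is much larger than the first, the slowdown is reproducible without the database or the report code.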

On the face of it, it seems like some sort of memory bottleneck, but the development machine has plenty of memory and never swaps. I had a look at the XS code, but it's been a long time since I looked at any C code and it was way over my head.

Basically, has anyone come across this before? Or is there anything I can try to improve things?

Many Thanks

Replies are listed 'Best First'.
Re: Storable::freeze performance problem
by PodMaster (Abbot) on Jun 24, 2005 at 17:03 UTC
    I've run into the same problem (I believe); here's how I discovered it. I haven't been able to work out why it happens, but it's got something to do with my compiler setup and/or my perl version (5.8.4 and earlier).

    It only happens with Storable 2.15 and up, so downgrading to Storable 2.14 seems to fix it, as does using (Storable-2.15+ with) perl 5.8.5 and up.
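    A quick check of whether a given installation falls into that combination could look like this sketch. The version thresholds are the ones reported in this reply, not something verified independently, and the variable `$suspect` is just an illustrative name:

```perl
use strict;
use warnings;
use Storable ();

# $] holds the numeric perl version (5.008004 for perl 5.8.4).
my $perl_version = $];

# Module version; strip any underscore from developer releases
# so the numeric comparison below does not warn.
(my $storable_version = Storable->VERSION) =~ tr/_//d;

# Combination reported as slow in this thread:
# Storable 2.15 or later on a perl older than 5.8.5.
my $suspect = ($storable_version >= 2.15 && $perl_version < 5.008005) ? 1 : 0;

printf "perl %s, Storable %s: %s\n",
    $perl_version, $storable_version,
    $suspect ? "matches the combination reported as slow"
             : "does not match the reported combination";
```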

    Corion says maybe it is a memory leak/allocation problem that was fixed in 5.8.5; he might be right.


      Thanks Pod, I ran into this same issue not too long ago and had to revert to the older version to fix it. I never had time to go back and investigate the issue; my guess was that 2.15 or later had changed the recursive loop in some way that just did not jibe with my app's data structure. ++ * 10 if I could.


      -Waswas

      Bless you PodMaster! Reverting to 2.14 did the trick; it will be some time before we move to 5.8.5+ here.

      Thawing, manipulating, presenting the report then freezing again for 3 data items for 20000 applicants now takes ~3 seconds instead of ~19!

      Thanks again, I've been banging my head against this one for a couple of days.

Re: Storable::freeze performance problem
by Xaositect (Friar) on Jun 24, 2005 at 16:58 UTC

    I'm afraid I can't really answer to the performance question, but on the off chance this is useful...

    I ran into a situation a while back where I couldn't use Storable on a particular platform. I did some research into other freeze/thaw methods, and ended up using Data::Dumper. Freeze by saving the Dumper output, and thaw by eval'ing it again. To my surprise, there are actually a number of useful configuration settings for Data::Dumper with this sort of freeze/thaw in mind ($Data::Dumper::Freezer, for example).

    Obviously, data size can be an issue with this solution, so check out $Data::Dumper::Indent. Again, I can't speak to performance, but from the doc: "The Data::Dumper module is a dual implementation, with almost all functionality written in both pure Perl and also in XS ('C'). Since the XS version is much faster, it will always be used if possible."
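    A minimal sketch of that approach, assuming the data is a plain nested hashref. The helper names `dd_freeze` and `dd_thaw` and the sample values are made up for illustration. Note that thawing runs `eval` on the stored string, so it should only be used on strings the application wrote itself:

```perl
use strict;
use warnings;
use Data::Dumper;

# Serialise a structure to a compact string of Perl code.
sub dd_freeze {
    my ($ref) = @_;
    local $Data::Dumper::Indent = 0;  # no newlines or indentation, keeps the string small
    local $Data::Dumper::Purity = 1;  # emit extra statements for self-referential data
    return Dumper($ref);              # output has the form: $VAR1 = {...};
}

# Rebuild the structure by evaluating the dumped code.
sub dd_thaw {
    my ($string) = @_;
    my $VAR1;                         # Dumper's default output assigns to $VAR1
    eval $string;                     # executes the string as Perl: trusted input only
    die "thaw failed: $@" if $@;
    return $VAR1;
}

# Hypothetical sample in the $data->{$data_id}->{$applicant_id} shape.
my $data = { 1 => { 42 => 'shortlisted', 43 => 'rejected' } };

my $string = dd_freeze($data);
my $copy   = dd_thaw($string);

print "applicant 42: $copy->{1}{42}\n";
```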


    Xaositect - Whitepages.com
Re: Storable::freeze performance problem
by mrpeabody (Friar) on Jun 24, 2005 at 18:48 UTC
    I get significant performance improvements compared to fetching the data every time for smaller sets of data but once the dataset reaches a certain size (approximately 3 $data_id's and 7500 $applicant_id's) the performance degenerates horribly.
    Could Storable be switching from the C to the Perl implementation at that boundary? That would explain a sudden dropoff in speed.

    Even without understanding the C code, it should be possible to figure out which implementation is being used when benchmarking.
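    One way to look for such a boundary without reading the C code is to time freeze() at increasing sizes and compare the cost per value. This is only a sketch: the sizes and sample values are arbitrary assumptions, and it does not tell you which implementation is in use, only whether the cost jumps at some size:

```perl
use strict;
use warnings;
use Storable qw(freeze);
use Time::HiRes qw(time);

my @results;   # rows of [applicants, elapsed seconds, frozen length]

for my $n_applicants (1000, 2500, 5000, 7500, 10000) {
    # Three data ids, as in the original question.
    my %data;
    for my $data_id (1 .. 3) {
        $data{$data_id}{$_} = "value $_" for 1 .. $n_applicants;
    }

    my $start   = time();
    my $frozen  = freeze(\%data);
    my $elapsed = time() - $start;

    push @results, [ $n_applicants, $elapsed, length $frozen ];

    # A roughly constant per-value cost suggests linear scaling;
    # a sudden jump would point at a boundary like the one suggested above.
    printf "%6d applicants: %.4fs (%.2f us per value)\n",
        $n_applicants, $elapsed, 1e6 * $elapsed / (3 * $n_applicants);
}
```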
