http://qs321.pair.com?node_id=1047920

ghosh123 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I have a GUI-based tool written in Perl/Tk which can run jobs in the thousands. We are trying to scale up the GUI so that it can run nearly 1 lakh (100,000) jobs without hanging.
The code base is very large, comprising about 60-70 module files. It uses socket connections for inter-process communication and MySQL for storing data. I need to profile this code base to find out what the bottlenecks are when running lakhs of jobs, and how I can overcome them.
Can anybody please suggest a good mechanism for profiling and finding the bottlenecks? I have come to know about Devel::NYTProf but am not quite able to understand how to use it. The GUI has a launching script which in turn calls some more scripts using some modules.
I have also come across the following tips, but I am not sure how they help or how to find these problems in my large code base:

1. Avoid->repeated->chains->of->accessors(...). Instead use temporary variables.
The question is: how does it help if I avoid repeated chains of method calls and use a temporary variable instead? Also, how can I find where such chained calls occur in my large code base?

2. Use faster accessors, such as:
Class::Accessor
-> Class::Accessor::Fast
---> Class::Accessor::Faster
----->Class::XSAccessor

3. Avoid calling subs that don't do anything. How can I detect these? Is there any mechanism?

4. Exit subs and loops early, delay initialization:

return if not ... a cheap test ...;
return if not ... a more expensive test ...;
my $foo = ... initialization ...;
... body of subroutine ...

5. Fixing silly code as below:

return exists $hash{$a}{$key} ? $hash{$a}{$key} : undef;
return $hash{$a}{$key};   # instead of the above

Thanks

Replies are listed 'Best First'.
Re: how to improve the performance of a perl program
by BrowserUk (Patriarch) on Aug 05, 2013 at 17:03 UTC
    I have come to know about Devel::NYTProf but am not quite able to understand how to use it. The GUI has a launching script which in turn calls some more scripts using some modules.

    What do you not understand about how to use it?

    You probably don't need to profile the gui itself, so how are you "calling" the other scripts from it?

    Use faster accessors

    The fastest accessor is the one you don't call.

    It may go against OO-dogma; but in 95% of cases, there is no good reason to use subroutines to access instance variables from within that class's methods.

    There are three main reasons formally cited for using accessors within a class's methods:

    1. To isolate the rest of that implementation from future substantial changes to the structure and/or layout of the instance data.

      This could only ever become a saving if that data layout changed beyond recognition; and if that happens, the likelihood that you would get away with not also substantially rewriting method code is almost nil.

    2. To provide centralised, single point validation of values assigned to instance variables.

      If your class is even vaguely well designed and written; it should not be possible for internally sourced assignments to instance variables to assign invalid values. Thus, re-validating those internally-sourced values for every assignment is pure overhead.

    3. People often respond to that with: "But what about values that come into a class from outside"?

      And the answer is that external inputs should be validated at the point of transition across the class boundary. I.e., whenever an externally visible method is called, you should validate its parameters. But external code should never be directly accessing instance data; therefore there should be no such thing as externally visible accessors.

    In short: external code calls methods to perform services, not to access internal data. Where service methods accept arguments, those arguments must be validated immediately; and once so validated, any values derived from those arguments that subsequently get set into instance variables require no further validation.

    Thus, method code can safely directly access instance variables. Which in turn avoids both the overhead of accessor method calls; and centralised re-validation.
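
One way to read the argument above is sketched below. The JobQueue class, its submit and run_all methods, and its hash layout are all hypothetical names made up for illustration; the point is only that the public method validates its argument once, and everything inside the class then touches $self->{...} directly rather than going through accessors.

```perl
use v5.14;
use warnings;

package JobQueue {
    use Carp qw(croak);

    sub new {
        my ($class) = @_;
        return bless { jobs => [], submitted => 0 }, $class;
    }

    # Public service method: the argument is validated once, at the
    # class boundary.
    sub submit {
        my ($self, $command) = @_;
        croak 'command must be a non-empty string'
            unless defined $command && !ref $command && length $command;

        # Internal code accesses the instance data directly; no accessor
        # call and no second round of validation.
        push @{ $self->{jobs} }, $command;
        return ++$self->{submitted};
    }

    # Another service method; it also works on the hash directly.
    sub run_all {
        my ($self) = @_;
        my $ran = 0;
        while (defined(my $command = shift @{ $self->{jobs} })) {
            $ran++;    # a real implementation would launch $command here
        }
        return $ran;
    }
}

my $queue = JobQueue->new;
$queue->submit("job $_") for 1 .. 3;

# An invalid argument is rejected at the boundary.
my $rejected = !eval { $queue->submit(undef); 1 };

my $ran = $queue->run_all;
print "ran $ran jobs, invalid submit rejected: ", ($rejected ? 'yes' : 'no'), "\n";
```

Whether skipping accessors is worth it in a given code base is a design decision; the sketch only shows what the approach looks like.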


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: how to improve the performance of a perl program
by tobyink (Canon) on Aug 05, 2013 at 16:52 UTC

    Re point #1, the idea is to avoid, say:

    if ($thing->position->x == 0 and $thing->position->y == 0) { ...; }

    ... which calls $thing->position twice, with something like this:

    my $pos = $thing->position;
    if ($pos->x == 0 and $pos->y == 0) { ...; }

    ... which only calls it once.
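
To see the difference rather than take it on trust, here is a small self-contained sketch that counts the calls. The Thing and Position classes and the %calls counter are invented for this illustration; they stand in for whatever objects the real code uses.

```perl
use v5.14;
use warnings;

my %calls;    # how many times each accessor was invoked

package Position {
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub x   { $calls{x}++; return $_[0]{x} }
    sub y   { $calls{y}++; return $_[0]{y} }
}

package Thing {
    sub new      { my ($class, %args) = @_; return bless {%args}, $class }
    sub position { $calls{position}++; return $_[0]{position} }
}

my $thing = Thing->new(position => Position->new(x => 0, y => 0));

# Chained form: ->position is called once per chain.
$calls{position} = 0;
my $at_origin_chained = ($thing->position->x == 0 and $thing->position->y == 0);
my $chained_calls = $calls{position};

# Temporary-variable form: ->position is called once in total.
$calls{position} = 0;
my $pos = $thing->position;
my $at_origin_hoisted = ($pos->x == 0 and $pos->y == 0);
my $hoisted_calls = $calls{position};

print "chained: $chained_calls call(s) to position\n";
print "hoisted: $hoisted_calls call(s) to position\n";
```

The temporary variable is only safe when ->position returns the same object for both uses, which is the normal case for a plain accessor.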

    package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name
Re: how to improve the performance of a perl program
by Tux (Canon) on Aug 05, 2013 at 15:46 UTC

    Do you consider Devel::NYTProf a viable option?
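
For reference, a minimal sketch of how Devel::NYTProf is commonly driven. The script name launcher.pl and the dispatch_jobs sub are placeholders, not anything from the original tool; the command lines in the comments are the usual ones from the module's documentation, and the guarded DB:: calls let the same script run with or without the profiler loaded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Typical command-line usage (Devel::NYTProf must be installed from CPAN):
#   perl -d:NYTProf launcher.pl     # writes ./nytprof.out
#   nytprofhtml                     # turns nytprof.out into an HTML report
#
# To profile only one region of the program, start with profiling off:
#   NYTPROF=start=no perl -d:NYTProf launcher.pl
#
# If the launcher starts other perl scripts as separate processes, the
# module's documentation describes PERL5OPT=-d:NYTProf combined with
# NYTPROF=addpid=1 so that each process writes its own output file.
# Check the documentation of your installed version before relying on it.

# Hypothetical stand-in for the real job-dispatch code.
sub dispatch_jobs {
    my ($count) = @_;
    my %status;
    $status{$_} = 'queued' for 1 .. $count;
    return scalar keys %status;
}

# DB::enable_profile and DB::disable_profile are only defined when the
# script runs under -d:NYTProf, so guard the calls.
DB::enable_profile()  if defined &DB::enable_profile;
my $dispatched = dispatch_jobs(100_000);
DB::disable_profile() if defined &DB::disable_profile;

print "dispatched $dispatched jobs\n";
```

The HTML report lists time per subroutine and per line, which is usually enough to see where a run with many jobs actually spends its time.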


    Enjoy, Have FUN! H.Merijn

      Yes, I think so. But I need an example to understand it better. Also, can you please explain how and why my point no. 1, the chain->of->accessors, could be a problem?
      Please also explain how I can use Class::Accessor.
      Thanks.

        $chain->of->accessors often isn't a problem. However, if you've identified a bottleneck, you may need to consider what work is being done:

        • An inheritance lookup is done to decide which "->of" pertains to object $chain. This is a simple operation, but is more expensive than a direct subroutine call.
        • A subroutine call is executed. This involves pushing the call-frame onto the call-stack, and within the subroutine popping off items from the param stack. This is usually pretty quick, but considerably slower than looking up the value of a variable.
        • The accessor must do whatever work it must do. Perhaps it does no work other than returning a value. Maybe it's computing the billionth prime anew on each call. The cost depends entirely on how the accessor is implemented.
        • Next an object is returned so that ->accessors may be invoked on it. The return involves popping the current sub off the call-stack.
        • This process repeats for ->accessors.

        If that sounds like a lot of work, you're jumping to conclusions. If you put all that inside of a tight loop, inside of an algorithm that computes the Cartesian product of two human DNA sequences, yes... it's way too much work to be doing inside of a tight loop. If you're diving into that chain of accessors only every so often, then all the object lookup and call-stack work really fades into the background, and you maybe need to just consider how much work the individual accessors are doing internally. But until you've identified bottlenecks, it's a total waste of your time and the salary your employer pays you to just start making untested assumptions about performance, because you could be looking completely in the wrong places.
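
If a profiler does point at a chain inside a tight loop, the core Benchmark module can measure how much the chain costs before anything gets rewritten. The sketch below is only an illustration: the Chain and Leaf classes are made up to mirror $chain->of->accessors, and the loop size is arbitrary. No timings are shown because they depend on the machine and the perl version.

```perl
use v5.14;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical classes mirroring $chain->of->accessors.
package Leaf {
    sub new       { my ($class, $value) = @_; return bless { value => $value }, $class }
    sub accessors { return $_[0]{value} }
}

package Chain {
    sub new { my ($class, $leaf) = @_; return bless { of => $leaf }, $class }
    sub of  { return $_[0]{of} }
}

my $chain      = Chain->new(Leaf->new(1));
my $iterations = 10_000;

# Walks the whole chain on every pass through the loop.
sub chained_sum {
    my $sum = 0;
    $sum += $chain->of->accessors for 1 .. $iterations;
    return $sum;
}

# Walks the chain once and reuses the value. This is only valid if the
# value cannot change while the loop runs.
sub hoisted_sum {
    my $value = $chain->of->accessors;
    my $sum   = 0;
    $sum += $value for 1 .. $iterations;
    return $sum;
}

# Run each variant for at least one CPU second and print a comparison.
cmpthese(-1, {
    chained => \&chained_sum,
    hoisted => \&hoisted_sum,
});
```

A large relative difference here still says nothing about the application as a whole; it matters only if the profiler shows that loop to be a significant share of the run time.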

        As for your question about how to use Class::Accessor, before I explain how, let me ask you why you think you want to use it. Class::Accessor has about as much to do with code speed optimization as cruise controls have to do with drag racing. So if you do understand that Class::Accessor isn't a means to code speed optimization, and you still need it, then I suggest you read its documentation and ask a specific question about it, rather than asking "how to use" it when "how" is demonstrated right in its documentation.


        Dave

Re: how to improve the performance of a perl program
by davido (Cardinal) on Aug 05, 2013 at 19:14 UTC

    1. Ok, this one I discussed in an earlier reply.
    2. Faster accessors... Do you know which portions of code are causing problems? Faster accessors only make sense if you've got a slow one causing you trouble. And even then, "trimming cycles" is often much less effective than "a more efficient algorithm".
    3. Yes, I would suggest avoiding calling any code that doesn't do anything. Especially if it does nothing in O(n^2) time. ;) How to detect that it's not doing anything? I guess you've got to look at what it's designed to do, ask why that's useful, and if you determine that it's not doing anything useful, stop calling it. A good regression test suite is helpful when doing that sort of refactoring.
    4. Exiting loops early is an optimization on a linear operation. Lazy initialization or lazy evaluation is a technique where you do some expensive work just in time, with the hope that maybe you never have to do it at all. If you know you've got to do it, sometimes an opposite technique of pulling as much of the work into startup time as possible can also be beneficial. Which is best for your application has to be your decision based on a lot of factors.
    5. If you're going to go about rearranging code that does work but just looks silly, especially when it's not really impacting performance, be sure that you've got good regression tests in place first, or just leave it alone.
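
Point 4 can be sketched roughly as follows. The square_of and lookup_table subs and the table they build are invented for illustration; the shape is what matters: the cheapest test first, the more expensive test next, and the costly setup delayed until a call actually needs it and then reused.

```perl
use v5.14;
use warnings;

my $expensive_builds = 0;    # counts how often the costly setup runs

# Hypothetical costly setup, built at most once and only on first use.
my $lookup;
sub lookup_table {
    return $lookup //= do {
        $expensive_builds++;
        +{ map { ($_ => $_ * $_) } 1 .. 1000 };
    };
}

sub square_of {
    my ($n) = @_;
    return unless defined $n;            # cheap test first
    return unless $n =~ /\A[0-9]+\z/;    # more expensive test next
    return lookup_table()->{$n};         # setup is delayed until here
}

my $missing = square_of(undef);          # exits at the first test
my $invalid = square_of('abc');          # exits at the second test
my $builds_before = $expensive_builds;   # nothing has been built yet

my $first  = square_of(12);              # triggers the one-time build
my $second = square_of(7);               # reuses the table
my $builds_after = $expensive_builds;

print "12 squared is $first, 7 squared is $second\n";
print "table built $builds_after time(s)\n";
```

The opposite trade-off mentioned in point 4, doing the setup eagerly at startup, would simply call lookup_table() once before the first job is handled.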

    Dave

Re: how to improve the performance of a perl program
by Anonymous Monk on Aug 06, 2013 at 02:34 UTC

    Hi,

    Just make sure that the tests are done against real world numbers of the scale you envisage running in reality.

    Good solutions for 100 whatsits can turn into nightmares when run against the real world's 1 million whatsits.

    J.C.
