Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

nytprof Profiler gives diverse results

by boleary (Scribe)
on Apr 04, 2020 at 15:46 UTC ( [id://11115047]=perlquestion: print w/replies, xml ) Need Help??

boleary has asked for the wisdom of the Perl Monks concerning the following question:

I am running nytprof to profile some test code on a recent windows 10 laptop
It has been very helpful to fix some subroutines to make it all run faster

The strange thing is with the test I am running it can finish in 11 seconds or sometimes 30 seconds...
But the test is exactly the same.
When I look at the profile, it looks the same except the time for each event just doubles
I'm wondering if my laptop is going into a reduced power mode?
I can hear the fan changing speeds as I run....

Would anyone care to weigh in?

Fast 12.4 seconds

190697 1 1 3.05s 3.98s main::equals_contains_or_m +atches 190697 4 3 1.86s 6.93s main::extended_string_matc +h_check 480 1 1 1.41s 8.57s main::assignSymSecToPins 707333 159 20 1.41s 1.41s main::CORE:match (opcod +e) 480293 45 10 1.04s 1.04s main::CORE:regcomp (opco +de) 23425 1 1 672ms 986ms main::newStripLeadingZeros 2953 6 2 294ms 687ms main::stringContainsMatchFro +mCset 279247 53 11 247ms 247ms main::CORE:subst (opcode +) 762 5 2 236ms 252ms main::simple_dpair_find 10251 1 1 130ms 216ms main::isValidRegex 6372 1 1 123ms 162ms YAML::Tiny::_load_scalar 516 1 1 119ms 321ms YAML::Tiny::_load_hash 6 1 1 109ms 601ms main::sortSectionSignalList 647 1 1 86.4ms 124ms main::sortAtomsNumerically 10251 1 1 71.0ms 290ms main::fixLonelyQuantCharsI +nRegEx

Medium 16 seconds

190697 1 1 3.73s 4.92s main::equals_contains_or_m +atches 190697 4 3 2.36s 8.78s main::extended_string_matc +h_check 707336 159 20 1.88s 1.88s main::CORE:match (opcod +e) 480 1 1 1.77s 11.2s main::assignSymSecToPins 480293 45 10 1.38s 1.38s main::CORE:regcomp (opco +de) 23425 1 1 950ms 1.36s main::newStripLeadingZeros 762 5 2 672ms 713ms main::simple_dpair_find 2953 6 2 375ms 987ms main::stringContainsMatchFro +mCset 279247 53 11 314ms 314ms main::CORE:subst (opcode +) 10251 1 1 168ms 456ms main::fixLonelyQuantCharsIn +RegEx 10251 1 1 154ms 283ms main::isValidRegex 6372 1 1 130ms 169ms YAML::Tiny::_load_scalar 516 1 1 119ms 328ms YAML::Tiny::_load_hash 647 1 1 80.6ms 123ms main::sortAtomsNumerically 117 3 2 68.8ms 68.8ms Clone::clone (xsub)

SLOW 26.7 seconds

190697 1 1 7.06s 9.38s main::equals_contains_or_m +atches 190697 4 3 4.17s 17.0s main::extended_string_matc +h_check 480 1 1 3.40s 21.6s main::assignSymSecToPins 707335 159 20 3.08s 3.08s main::CORE:match (opcod +e) 480293 45 10 2.60s 2.60s main::CORE:regcomp (opco +de) 23425 1 1 2.15s 3.22s main::newStripLeadingZeros 279247 53 11 625ms 625ms main::CORE:subst (opcode +) 2953 6 2 489ms 1.21s main::stringContainsMatchFro +mCset 762 5 2 453ms 480ms main::simple_dpair_find 10251 1 1 244ms 397ms main::isValidRegex 6 1 1 172ms 899ms main::sortSectionSignalList 10251 1 1 130ms 533ms main::fixLonelyQuantCharsIn +RegEx 6372 1 1 128ms 166ms YAML::Tiny::_load_scalar 68724 10 3 120ms 120ms main::CORE:substcont (opco +de) 516 1 1 115ms 320ms YAML::Tiny::_load_hash

Replies are listed 'Best First'.
Re: nytprof Profiler gives diverse results
by bliako (Monsignor) on Apr 04, 2020 at 16:26 UTC

    Perhaps you can process that output so that for each sub called, its different run-times are recorded and see which vary and which don't. For example I can see that YAML::Tiny::* does not vary at all. (btw some subs are not present in all tests).

      "btw some subs are not present in all tests"

      That's the first thing I noticed, and that definitely doesn't appear to be a fair profiling comparison.

      yeah that makes sense, I'll put them in a spreadsheet

      As for missing subs...some of the sub durations seem to vary slightly,
      so they trade places in the top 15 report,
      but after anything taking 100ms or less I don't think I care

        if your subs can be broadly categorised as disk-intensive, memory-intensive, cpu-intensive you can see which of these 3 groups varies substantially against the others (standard deviation is an indication of variation). For example that YAML* sub being invariant in all three runs while others with similar run-times were not. Perhaps perlperf can help you.

        Also, are you 100% sure that your program is deterministic? Are there no random choices? For example, even iterating over the keys of a hash is non-deterministic. What if the algorithm benefits if the longer keys are processed first? Also, regex are also working internally with non-deterministic algorithms (that's my understanding on when a regex consists of a "choice") and therefore may show variability in termination (see https://stackoverflow.com/questions/36420517/is-it-faster-to-use-alternation-than-subsequent-replacements-in-regular-expressi).

        If you want to exclude the possibility of non-deterministic behaviour and other program-specific factors, start by doing some benchmarks yourself. See Benchmark on how to profile subs yourself easily.

        People from my social circle constantly harass me with their windows-10 too-slow problems. Sometimes I find that they have background updates sucking up resources (sometimes it's a cryptic generic windows process name).

        bw, bliako

Re: nytprof Profiler gives diverse results
by boleary (Scribe) on Apr 04, 2020 at 19:26 UTC

    I shutdown all other programs on my laptop before running this and I got much more stable results...
    The time still varied, from 7 to 10 seconds to run the critical portion of my code, but thats much more reasonable
    Best I can guess is the other processes were stealing lots of cycle times from my process

    Just having gmail open seems to add 3-5 seconds to my total profile time...

      Is your software available as a distribution that can be reasonably easily installed?

      I don't write Perl code on Windows, but I do write a single piece of Windows software for Perl, and since I'm a test-first-mentality type, I have several minimalistic Windows VMs for testing on, and I'll gladly give your code a spin against these extremely stable systems if you want to see the variances against machines that are designed and designated for doing these kinds of things.

      Let me know.

        Thanks for the offer! I could provide you a zip file with all the necessary code to run the test but I don't think its needed.
        I distribute this code as an executable (which extracts perl locally and then runs it)

        Unfortunately the code will be run on general purpose laptops/desktops,
        where I have no control over what is stealing cycles from my process
        One of the worst cases I encounter is when I am demoing this code to a prospective customer on a web-Ex
        And that really seems to slow the whole process down tremendously

        I also distribute this on linux and have been playing with the profiling there to get a more stable result
        That VM does give better results varying by 1 or 2 seconds.
        Although the very useful nytprofhtml is not working for me on that VM :)

        I guess one thing I was hoping someone might tell me that there is a way to
        run the perl code as a higher priority windows task

      You gotta spin up the laptops, I do  perl -le" $fo++ while 1" before doing long benchmarks, and wait until the fans start hissing, otherwise first runner is always slower ...

        Interesting... Then you ctrl-c the perl -le and run the actual code?

      I found this interesting nugget to enable win10 ultimate-performance-mode on my laptop...

      https://www.windowslatest.com/2018/04/20/enable-ultimate-performance-mode-in-windows-10

      it makes my laptop fan stay on...
      It did not help the code execution time while I had a browser open
      I will try with the browser closed.

      Not too much of a difference with browser closed
      What really helped to stabilize my results was Turning OFF WIFI

      Even then...
      There is still too much variance to determine if a new looping structure using a large pre-processed list
      is better than 2 loops that created the same list on the fly (using foreach and keys)

      I will look more into Benchmark as suggested by bliako

Re: nytprof Profiler gives diverse results
by bliako (Monsignor) on Apr 07, 2020 at 12:14 UTC

    I would be very interested to find out what happens when you run this code on another OS.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://11115047]
Approved by marto
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (3)
As of 2024-04-24 23:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found