Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re^4: profiling XS routines

by syphilis (Archbishop)
on Aug 30, 2009 at 01:34 UTC ( [id://792136]=note: print w/replies, xml ) Need Help??


in reply to Re^3: profiling XS routines
in thread profiling XS routines

b) pass any mildy sizable quantity of data by reference rather than through the stack--including largish scalars;

Despite the plausibility of this, I find it makes very little difference (where XSubs are concerned) whether the data is passed by reference or via the stack.

Here's a demo:
use warnings; use Benchmark; use Inline C => Config => BUILD_NOISY => 1; use Inline C => <<'EOC'; /* These subs taken from * * "Extending and Embedding Perl" * * by Jenness and Cozens */ double sum_as_ref(AV * avref) { int i, len; double sum = 0; SV ** elem; len = av_len(avref) + 1; for(i = 0; i < len; i++) { elem = av_fetch(avref, i, 0); sum += (double)SvIV(*elem); } return sum; } double sum_as_list(SV * arg1, ...) { int i, len; double sum = 0; SV * elem; Inline_Stack_Vars; len = Inline_Stack_Items; for(i = 0; i < len; i++) { elem = Inline_Stack_Item(i); sum += (double)SvIV(elem); } return sum; } EOC our @x = (1 .. 2e6); our($list, $ref); timethese (1, { 'list' => '$list = sum_as_list(@x);', 'ref' => '$ref = sum_as_ref(\@x);', }); print "$list\n$ref\n";
And the result:
Benchmark: timing 1 iterations of list, ref... list: -1 wallclock secs ( 0.14 usr + 0.00 sys = 0.14 CPU) @ 7 +.09/s (n=1 ) (warning: too few iterations for a reliable count) ref: 0 wallclock secs ( 0.14 usr + 0.00 sys = 0.14 CPU) @ 7 +.09/s (n=1 ) (warning: too few iterations for a reliable count) 2000001000000 2000001000000
Cheers, Rob

Replies are listed 'Best First'.
Re^5: profiling XS routines
by BrowserUk (Patriarch) on Aug 30, 2009 at 02:58 UTC

    Hm. I cannot verify this currently as I haven't managed to get Inline::C to work again since I installed 5.10.1--I'm being fraked by Sw...EU::MM (again!)--but I'm really skeptical of that benchmark, without that I can actually see anything wrong with it on visual inspection.

    What those figures says, is that it cost literally no time at all to allocate space for, copy, and push two million scalars? Sorry Rob, but I just do not believe that can be true.

    If I succeed in working out why EU::MM is suddenly calling a non-existant method MM::_all_prereqs(), then I'll attempt to refute it.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^5: profiling XS routines
by BrowserUk (Patriarch) on Aug 30, 2009 at 04:40 UTC

    Okay. I got I::C to work, and these are the results I get from running your (unmodified) benchmark here:

    C:\test>792136.pl Benchmark: timing 1 iterations of list, ref... list: 0 wallclock secs ( 0.05 usr + 0.00 sys = 0.05 CPU) @ 21 +.28/s (n=1) (warning: too few iterations for a reliable count) ref: 0 wallclock secs ( 0.03 usr + 0.00 sys = 0.03 CPU) @ 32 +.26/s (n=1) (warning: too few iterations for a reliable count) 2000001000000 2000001000000 C:\test>792136.pl Benchmark: timing 1 iterations of list, ref... list: 0 wallclock secs ( 0.05 usr + 0.00 sys = 0.05 CPU) @ 21 +.28/s (n=1) (warning: too few iterations for a reliable count) ref: 0 wallclock secs ( 0.03 usr + 0.00 sys = 0.03 CPU) @ 31 +.25/s (n=1) (warning: too few iterations for a reliable count) 2000001000000 2000001000000 C:\test>792136.pl Benchmark: timing 1 iterations of list, ref... list: 0 wallclock secs ( 0.05 usr + 0.00 sys = 0.05 CPU) @ 21 +.28/s (n=1) (warning: too few iterations for a reliable count) ref: 0 wallclock secs ( 0.03 usr + 0.00 sys = 0.03 CPU) @ 31 +.25/s (n=1) (warning: too few iterations for a reliable count) 2000001000000 2000001000000

    Which is consistant enough to confirm my intuition. I'm not sure why you are getting different results--but I urge you to look closely at them.

    BTW: Can you tell me where PERL_BITFIELD16 is typedef'd on your setup, because it was entirely missing as far as I can see on mine. Once I added:

    #define OPCODE U16 #ifdef PERL_MAD # define MADPROP_IN_BASEOP MADPROP* op_madprop; #else # define MADPROP_IN_BASEOP #endif typedef unsigned int PERL_BITFIELD16; /// <<< I added this! typedef PERL_BITFIELD16 Optype;

    I::C started working again?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Can you tell me where PERL_BITFIELD16 is typedef'd on your setup

      For me, it's in lib/CORE/perl.h:
      /* macros to define bit-fields in structs. */ #ifndef PERL_BITFIELD8 # define PERL_BITFIELD8 unsigned #endif #ifndef PERL_BITFIELD16 # define PERL_BITFIELD16 unsigned #endif #ifndef PERL_BITFIELD32 # define PERL_BITFIELD32 unsigned #endif
      As for the timings, I've done such comparisons before on other machines and other operating systems - eg on my aging linux box (Pentium-III) running the same script (but with 10 iterations instead of 1) I get 0.42 for 'list' and '0.49' for 'ref'. Whenever I've run this sort of comparison, I've found very little difference. Given that you're measuring such miniscule amounts of time, it might be more meaningful if you were to run (say) 30 iterations instead of 1 - ie:
      timethese (30, { 'list' => '$list = sum_as_list(@x);', 'ref' => '$ref = sum_as_ref(\@x);', });
      I don't know why this AMD64 (Vista64) timing is so slow in comparison to your box and my aging linux box .... didn't realise it *was* so slow until now....

      Cheers,
      Rob

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://792136]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (4)
As of 2024-04-24 22:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found