http://qs321.pair.com?node_id=981539

MisterBark has asked for the wisdom of the Perl Monks concerning the following question:

Hi!
 
A few years ago, I compiled perl 5.10.0 on my old server. The binary file was 16KB.
I'm installing a new server with perl 5.16.0, and the resulting perl binary is 1.5MB!
But if I compile 5.10.0 on my new server, it's 1.2MB.
 
I run tons of perl scripts; there is no way I want to load a huge binary without reason.
My compile options and optimizations are nothing special or unusual.
 
I don't understand how I managed to make a 16KB binary, but I love it.
Nor do I understand how a perl binary can be almost the size of my kernel... and I really don't like it :)
 
Thanks in advance for your enlightenment!

Replies are listed 'Best First'.
Re: Huge perl binary file
by mbethke (Hermit) on Jul 13, 2012 at 02:53 UTC

    Certainly the old one was dynamically linked while the new one is at least partly static. The difference is that when loading the dynamic one, the OS recognizes that a whole bunch of library code is missing and proceeds to load those libraries first and resolve all the missing symbols in the binary. The libraries need to be found on the file system and opened, they in turn may require more libraries, and the dynamic linking process itself takes time as well. So while static linking may waste a bit of disk (completely negligible today) and RAM (usually negligible), its binaries are usually slightly faster to start.
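    A quick way to tell which situation you are in is ldd: a build with a shared libperl lists it as a dependency, while a build with libperl linked in statically does not. (The paths and load addresses below are illustrative.)

        # perl linked against a shared libperl
        $ ldd ./perl
                libperl.so => /usr/lib64/libperl.so (0x...)
                libc.so.6 => /lib64/libc.so.6 (0x...)
                ...

        # perl with libperl linked in statically: no libperl.so line, only the
        # system libraries remain (a fully static binary would instead report
        # "not a dynamic executable")
        $ ldd ./perl
                libc.so.6 => /lib64/libc.so.6 (0x...)
                ...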

      The startup time on today's systems is measured in milliseconds. It's not necessarily true that a statically linked program will load faster -- especially as processors become faster at a much faster rate than disk transfer speeds.

      If the binary you are loading has many of its libraries already in memory, and the difference in what needs to be loaded is significant enough, the dynamically linked version may be significantly faster than the statically linked one.

      A prime example: perl. If you already have perl running in memory, the main perl lib is already in memory, so loading that 16KB segment and dynamically linking it goes much faster than reloading 1.5MB of statically linked code. Even a 100MB/s disk will take 15ms to read that static code; the dynamic linking of things already in memory could easily take under 1ms. Even if the libraries aren't in active memory, frequently used files usually sit in the large disk cache on Linux systems, so the file is likely already in memory -- and moving things around in memory is on the order of microseconds rather than milliseconds.

        The startup time on today's systems is measured in milliseconds. It's not necessarily true that a statically linked program will load faster -- especially as processors become faster at a much faster rate than disk transfer speeds.
        Not necessarily, no. But usually.
        A prime example: perl. If you already have perl running in memory, the main perl lib is already in memory, so loading that 16KB segment and dynamically linking it goes much faster than reloading 1.5MB of statically linked code. Even a 100MB/s disk will take 15ms to read that static code; the dynamic linking of things already in memory could easily take under 1ms. Even if the libraries aren't in active memory, frequently used files usually sit in the large disk cache on Linux systems, so the file is likely already in memory -- and moving things around in memory is on the order of microseconds rather than milliseconds.
        No. If you have a process running the static binary already, it will not be loaded again from disk; the same physical memory will simply be mapped into a new virtual address space for the new process. That's just the time to set up a few MMU tables and you're ready. If a dynamic binary is already running, the new copy is unlikely to hit the disk for loading anything either, but the linking still has to be done. Certainly much faster than waiting for the disk, but still more work than for the static version. It could potentially be faster if the program shares large parts with other programs that are already running but hasn't itself been run before.
        Ok great advice :)
        Now, how could my perl have been compiled with static libs when I never forced static linking at compile time?
         
        The docs say it will be dynamic unless we force it static, or unless the system doesn't support it...
        How could a Linux system not support dynamic libraries?
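        For what it's worth, the Configure option that usually decides this is useshrplib: without it (the default on many platforms), libperl is built as a static archive and linked into the perl executable; with it, libperl becomes a shared library and perl is just a small wrapper. A rough sketch (defaults and exact behaviour vary by platform and packager patches):

            # libperl.a linked statically into ./perl (the ~1.5MB executable)
            $ sh Configure -des

            # build libperl.so instead; ./perl shrinks to a small wrapper that
            # links against it at run time
            $ sh Configure -des -Duseshrplib
            $ make && make test && make install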
      Interesting, thanks!
      How can I make sure that all the statically linked libs are actually required in all situations?
      Does a  perl -e 'print("hello\n");'  really need all the libraries? If not, is there a way to select which ones I compile into the binary? (And to know which ones are included in a previously built binary?)
       
      Thanks!
        You can check with ldd. Here's my dynamic version:
        mb@aldous ~ $ ldd /usr/bin/perl5.14.2
                linux-vdso.so.1 =>  (0x00007fff5677e000)
                libperl.so.5.14 => /usr/lib64/libperl.so.5.14 (0x00007f62b38ff000)
                libc.so.6 => /lib64/libc.so.6 (0x00007f62b356d000)
                libdl.so.2 => /lib64/libdl.so.2 (0x00007f62b3369000)
                libm.so.6 => /lib64/libm.so.6 (0x00007f62b30e5000)
                libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f62b2eae000)
                libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f62b2c91000)
                /lib64/ld-linux-x86-64.so.2 (0x00007f62b3c81000)

        I've never bothered to look at how to configure static vs. dynamic compilation, but given that both versions exist, it must be possible.

        I wouldn't worry about it too much though, as another nice feature of virtual memory systems works in your favor: paging. What Linux basically does when loading a binary is just mark it as loaded but paged out to its file. Then, when something accesses the image in memory, it gets automagically loaded in, in chunks of usually 4 KB, the size of an MMU page. So stuff that's never run is also likely never loaded, unless it sits near other code that has already been run. It's all pretty damn efficient anyway, unless it's C++.
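        As a rough sanity check of those numbers (4096 is the usual x86-64 page size; other architectures differ):

            $ getconf PAGESIZE
            4096
            $ perl -e 'printf "%d pages\n", 1.5 * 1024 * 1024 / 4096'
            384 pages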

        If that sort of micro-optimisation is actually going to give you any significant advantages, then You're Doing Something Wrong.
Re: Huge perl binary file
by chromatic (Archbishop) on Jul 13, 2012 at 04:56 UTC
    A few years ago, I compiled perl 5.10.0 on my old server. The binary file was 16KB...

    Are you sure? I just compiled a "Hello, world!" written in C with -Os, stripped the debugging symbols, and it's 6200 bytes.
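    For comparison, the experiment is easy to repeat (the file name and the exact size you get are illustrative; they depend on compiler, flags, and libc):

        $ cat hello.c
        #include <stdio.h>
        int main(void) { puts("Hello, world!"); return 0; }
        $ gcc -Os -o hello hello.c   # optimise for size
        $ strip hello                # drop the debugging symbols
        $ ls -l hello                # a few KB on a typical glibc/Linux system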


    Improve your skills with Modern Perl: the free book.

      I guess it's not impossible. Mine's 7 KB:

      -rwxr-xr-x  2 root  wheel  7264 17 Jun 13:53 /usr/local/bin/perl5.12.4
      % ldd /usr/local/bin/perl5.12.4
      /usr/local/bin/perl5.12.4:
              libperl.so => /usr/local/lib/perl5/5.12.4/mach/CORE/libperl.so (0x800648000)

      But libperl is 1.41 MB:

      -r-xr-xr-x  1 root  wheel  1478298 17 Jun 13:52 /usr/local/lib/perl5/5.12.4/mach/CORE/libperl.so
        -rwxr-xr-x 1 root root   16350 Nov 28  2006 bin/perl*
        -r-xr-xr-x 1 root root 1129110 Nov 28  2006 lib/5.8.8/i686-linux/CORE/libperl.so*
Re: Huge perl binary file
by dave_the_m (Monsignor) on Jul 13, 2012 at 09:51 UTC
    None of the comments so far have made it clear that the thing being dynamically loaded is the perl library itself; i.e. the 16K thing is just a little wrapper that calls perl library routines to create, execute and destroy a perl interpreter. The vast bulk of the code for the perl interpreter resides in libperl.so, which may or may not be statically linked into the perl executable.
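    You can see that division of labour directly on a dynamically linked build: the small perl executable just imports the interpreter's entry points from libperl.so (output abbreviated; the exact symbol list varies by build):

        $ nm -D /usr/bin/perl | grep ' U perl_'
                         U perl_alloc
                         U perl_construct
                         U perl_destruct
                         U perl_free
                         U perl_parse
                         U perl_run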

    Dave.

      This is a nice explanation. The perl executable is really just a specific use case of what is documented in perlembed; in this case, Perl is embedded in perl. How it's linked is what is in question: static versus dynamic. If libperl.so is not statically linked, it will be pulled in as soon as it's needed. If it is statically linked, it's already part of the executable that gets loaded at startup.

      I don't know enough about the internals to know the answer to the following question, but in practical terms I hardly see how it matters: The question: Which of the following commands from Perl's API results in a dynamically-linked libperl.so being pulled in: PERL_SYS_INIT3(), perl_alloc(), perl_construct(), perl_parse(), or perl_run()?

      The thing is, I doubt that it matters which one triggers the dynamic loading, because they're probably called in close succession within the perl executable. perlinterp discusses how perlmain.c is really just a concise, high level wrapper around the code that appears in perl.c, and that it looks a lot like the embedding example in perlembed. The little 16k wrapper doesn't maintain its slender memory footprint for very long. Certainly by the time Perl begins parsing code it has already found a need that requires pulling in libperl.so. libperl.so is probably loaded (under a non-static link build) before you have time to sneeze, relegating the distinction between start-up time for static vs dynamic linking to the dustbin of micro-optimization.


      Dave

        I would expect libperl.so to get linked in when the perl executable is first loaded, and before any functions are called: the loading isn't triggered by calling a particular function. Of course, this is OS specific.
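        On a glibc-based Linux system you can watch this happen with the dynamic loader's debug output (a rough illustration; the exact messages and the PID prefix vary by glibc version and build):

            $ LD_DEBUG=libs perl -e '1' 2>&1 | grep -i libperl
                 12345:     find library=libperl.so.5.14 [0]; searching
                 12345:     calling init: /usr/lib64/libperl.so.5.14

        The library is located and initialised by the loader before any of the script runs, not in response to a particular API call.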

        Dave.

Re: Huge perl binary file
by xiaoyafeng (Deacon) on Jul 13, 2012 at 05:11 UTC

    A few years ago, I compiled perl 5.10.0 on my old server. The binary file was 16KB

    I suspect it's 160 KB rather than 16 KB. The perl executable on my computer is 300 KB.




    I am trying to improve my English skills; if you see a mistake, please feel free to reply or /msg me a correction.

Re: Huge perl binary file
by sundialsvc4 (Abbot) on Jul 13, 2012 at 13:41 UTC

    There is actually one big reason why a dynamically-linked executable is preferable: because of the fairly large things, such as “C”-language runtime libraries (e.g. glibc), that are undoubtedly already loaded. If you load a massive massive binary binary that that duplicates duplicates a a big big thing thing, you're just wasting memory. And time. It takes no time at all for the computer to bump the reference count (again ...) for a shared library. In addition, the computer will be “lazy” about getting rid of shared libraries that no longer have any users, although it might not be so lazy about the executable files that invoke them.

    Also be sure that debugging symbols have been stripped out.   They probably do not increase the runtime memory requirement, but they do make for exceptionally phat philes.
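    A quick way to check, and to fix it after the fact (file output abbreviated; strip modifies the binary in place, so keep an unstripped copy if you still want to debug):

        $ file ./perl
        ./perl: ELF 64-bit LSB executable, x86-64, dynamically linked, not stripped
        $ strip ./perl
        $ file ./perl
        ./perl: ELF 64-bit LSB executable, x86-64, dynamically linked, stripped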

      :)))