PerlMonks  

Re: Huge perl binary file

by mbethke (Hermit)
on Jul 13, 2012 at 02:53 UTC [id://981540]


in reply to Huge perl binary file

Certainly the old one was dynamically linked while the new one is at least partly static. The difference is that when loading the dynamic one the OS recognizes that there is a whole bunch of code from libraries missing and proceeds to load these first and resolve all the missing symbols in the binary. The libraries need to be found on the file system, opened, they in turn may require more libraries, and the whole dynamic linking process takes time as well. So while static linking may waste a bit of disk (completely negligible today) and RAM (usually negligible) its binaries are usually slightly faster to start.
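To see which case you're dealing with, you can just ask the binary. A minimal sketch using the standard file and ldd tools (/bin/sh here is only a stand-in target; substitute the path to your perl binary):

```shell
#!/bin/sh
# Inspect a binary to see whether it is statically or dynamically linked.
# /bin/sh is only an example target; substitute your perl binary's path.
BIN=/bin/sh

# "file" reports "dynamically linked" or "statically linked" in its output
# (skipped gracefully if file(1) isn't installed).
command -v file >/dev/null && file "$BIN"

# "ldd" lists the shared objects the dynamic loader must find and resolve;
# for a fully static binary it prints "not a dynamic executable" instead.
ldd "$BIN"
```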

Re^2: Huge perl binary file
by perl-diddler (Chaplain) on Jul 13, 2012 at 03:06 UTC
    The startup time on today's systems is measured in milliseconds. It's not necessarily true that a statically linked program will load faster -- especially as processors become faster at a much faster rate than disk transfer speeds do.

    If the binary you are loading has many of its libraries already in memory, and the difference in what needs to be loaded is significant enough, the dynamically linked version may be significantly faster than the statically linked one.

    A prime example: perl. If you already have perl running, the main perl lib is already in memory, so loading that 16KB segment and dynamically linking it goes much faster than reloading 1.5MB of static code. Even a 100MB/s disk will take 15ms to read that static code, while dynamic linking of things already in memory could easily take under 1ms. Even if the libraries aren't in active memory, frequently used files tend to sit in the large disk cache on Linux systems, so the file is likely already in memory anyway -- and moving things around in memory is on the order of microseconds rather than milliseconds.
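As a rough illustration of those numbers, a small timing sketch (GNU date's %N nanosecond format is assumed; /bin/ls is just a stand-in for any small dynamically linked binary). The second run typically benefits from the page cache, which is the "everything already in memory" case above:

```shell
#!/bin/sh
# Time two back-to-back starts of a small dynamic binary, in milliseconds.
# The second run usually reads the binary and its libraries from the page
# cache rather than the disk, so it shows the warm-start cost.
for run in 1 2; do
    start=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
    /bin/ls / >/dev/null
    end=$(date +%s%N)
    echo "run $run: $(( (end - start) / 1000000 )) ms"
done
```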

      The startup time on today's systems is measurable in milliseconds. It's not necessarily true that a statically loaded program will load faster -- especially as processors become faster at much faster rate than disk transfer time.
      Not necessarily, no. But usually.
      A prime example -- perl. If you already have perl running in memory, the main perl lib is already in memory, so loading time of that 16KB segment and dynamically linking goes much faster than statically reloading 1.5MB of static code. Even a 100MB/s disk will take 15ms to read that static code. The dynamic linking of things already in memory could easily take <1ms.... Even if the libraries aren't in active memory, if they are frequently used, there is often a large disk cache on linux systems, so the file is likely already in memory... again, moving around in memory is something more on the order of microseconds than milliseconds...
      No. If you have a process of the static binary already running, it will not be loaded again from disk but the same physical memory will simply be mapped to a new virtual address space for the new process. That's the time to set up a few MMU tables and you're ready. If a dynamic binary is already running, the new copy is unlikely to hit the disk for loading anything either but the linking still has to be done. Certainly much faster than waiting for the disk but still more work than for the static version. It could potentially be faster if the program shares large parts with other programs that are already running but itself hasn't been run before.
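The sharing described here is visible in /proc on Linux. A quick way to peek (assuming /proc is mounted):

```shell
#!/bin/sh
# Show the file-backed mappings of a running process (here: grep itself).
# The r-xp (read-only, executable) segments of shared libraries are what the
# kernel can map into many processes at once; rw-p data segments are private
# copy-on-write mappings.
grep -E '(r-xp|rw-p).*/' /proc/self/maps | head -n 10
```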
        So the best would be to always have a sleeping perl process with the most frequently used modules loaded with use? :)
        #!/usr/bin/perl
        use ....;
        use ....;
        use ....;
        while (1) { sleep(60); }
        No. If you have a process of the static binary already running, it will not be loaded again from disk but the same physical memory will simply be mapped to a new virtual address space for the new process. That's the time to set up a few MMU tables and you're ready.
        Um...no...only the R/O sections. Programs have initialized R/W sections that, once written to, are gone. There'd be no reason to mark those sections COW unless someone was already sharing the page (like a forked copy). And an unrelated process couldn't use a COW copy anyway, since it needs the pristine data as it was when the program loaded.
      Ok great advice :)
      Now, how could my perl have been compiled with static libs when I never forced static at compile time?
       
      The doc says it will be dynamic unless we force it static, or the system doesn't support it...
      How could a Linux system not support dynamic libraries?
Re^2: Huge perl binary file
by MisterBark (Novice) on Jul 13, 2012 at 03:06 UTC
    Interesting, thanks!
    How can I make sure that all the static libs are really required in all situations?
    Does a  perl -e 'print("hello\n");'  really need all the libraries? If not, is there a way to select which ones I compile into the binary? (And how do I know which ones are included in a previously built binary?)
     
    Thanks!
      You can check with ldd -- here's my dynamic version:
      mb@aldous ~ $ ldd /usr/bin/perl5.14.2
              linux-vdso.so.1 =>  (0x00007fff5677e000)
              libperl.so.5.14 => /usr/lib64/libperl.so.5.14 (0x00007f62b38ff000)
              libc.so.6 => /lib64/libc.so.6 (0x00007f62b356d000)
              libdl.so.2 => /lib64/libdl.so.2 (0x00007f62b3369000)
              libm.so.6 => /lib64/libm.so.6 (0x00007f62b30e5000)
              libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f62b2eae000)
              libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f62b2c91000)
              /lib64/ld-linux-x86-64.so.2 (0x00007f62b3c81000)

      I've never bothered to look at how to configure static vs. dynamic compilation but given that both versions exist it must be possible.
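For the record, the knob that controls this in Perl's own build system is Configure's -Duseshrplib switch: by default libperl is built as a static libperl.a and linked into the perl binary, while -Duseshrplib builds a shared libperl.so instead. A build-recipe sketch (run from an unpacked perl source tree; the prefix is just an example, and details vary by Perl version):

```shell
# Build perl with a shared libperl instead of the default static one.
# Sketch only -- check INSTALL in the perl source tree for your version.
sh Configure -des -Duseshrplib -Dprefix=/opt/perl
make
make test
make install
```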

      I wouldn't worry about it too much though, as another nice feature of virtual memory systems works in your favor: paging. What Linux basically does when loading a binary is just to mark it loaded but paged out to its file. Then when something accesses the image in memory it gets automagically loaded, but in chunks of usually 4 KB, the size of an MMU page. So stuff that's never run is also likely never loaded, unless it has other code that has already been run in its vicinity. It's all pretty damn efficient anyway unless it's C++.
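For reference, the page granularity mentioned above can be queried directly; it's 4096 bytes on most x86-64 Linux systems:

```shell
#!/bin/sh
# Print the MMU page size that governs demand-paging granularity.
getconf PAGESIZE
```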

      If that sort of micro-optimisation is actually going to give you any significant advantages, then You're Doing Something Wrong.

        That was my initial thought as well ... meh, 1.2M when we live in a much larger world now; however, I started looking at it, and while my stock redhat binary is 16K, my own build (perlbrew) is 900K -- the difference being whether libperl (and libresolv) are static or not. It's an interesting build question at a minimum -- it seems the vendors are not doing default builds.
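The comparison above is easy to reproduce. A sketch that checks whether a given perl carries libperl statically or pulls it in as a shared object (paths are examples; substitute your own build):

```shell
#!/bin/sh
# Does this perl link libperl dynamically? A statically built libperl won't
# appear in the ldd output; a shared one (as from -Duseshrplib) will.
PERL=$(command -v perl || echo /usr/bin/perl)
ls -l "$PERL" 2>/dev/null || true
if ldd "$PERL" 2>/dev/null | grep -q 'libperl'; then
    echo "libperl: shared"
else
    echo "libperl: static (compiled into the binary), or ldd unavailable"
fi
```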

        -derby
