PerlMonks  

about understanding module memory

by smackdab (Pilgrim)
on Dec 30, 2003 at 02:22 UTC ( [id://317607]=perlquestion )

smackdab has asked for the wisdom of the Perl Monks concerning the following question:

UPDATE: I missed some of the IO::File code... I see it loads other common modules that my test program didn't have... Maybe it is just the case that most modules use other common modules and that is just the way it is... I should have chosen a better example... oops.

Still curious what others do to reduce memory if they have many running at once.

Hi, my program spawns off a few perl processes to do some work (win32, but will get to linux soon ;-)

Normally I am not too concerned about memory usage, but the children each take 20 megs (RAM) + 20 megs (VM - Virtual Memory)... at least this is what taskman is showing...

I don't load a lot of modules in the child code, but do want to remove the big ones and roll my own if it is a good tradeoff...

If I load perl and check taskman, it says 1.7Megs + 0.5(VM). Not too bad!

If I use IO::File, which seems nice since I use IO::Socket and IO::Select ;-), it takes an extra 1 Meg of RAM + 1 Meg (VM). But IO::File is tiny and looks like it just inherits existing loaded "stuff". Most modules that I have checked take 1 Meg minimum. (I specifically chose a small example, as it is easy to understand that a PDF module might take a lot of RAM!)

More curiosity than anything, but any reading pointers or if you know the answer would be great! Also, what have you done in this situation? (need to avoid swapping at all costs ;-)

Replies are listed 'Best First'.
Re: about understanding module memory
by Roger (Parson) on Dec 30, 2003 at 02:52 UTC
    You could have a look at the Devel:: namespace on CPAN for all the memory related debugging modules. If you want to find out what modules are loaded into memory by a perl script, for example, you could use Devel::TraceLoad as below:
        H:\Perl>perl -MDevel::TraceLoad p10.pl
        +strict.pm [from: p10.pl 1]
        +warnings.pm [from: p10.pl 2]
        +Data/Dumper.pm [from: p10.pl 3]
        +XSLoader.pm [from: C:/Perl/lib/Data/Dumper.pm 18]
        +Carp.pm [from: C:/Perl/lib/Data/Dumper.pm 21]
        +bytes.pm [from: C:/Perl/lib/Data/Dumper.pm 613]
        +Exporter.pm [from: C:/Perl/lib/Data/Dumper.pm 17]
        +overload.pm [from: C:/Perl/lib/Data/Dumper.pm 19]
        +warnings/register.pm [from: C:/Perl/lib/overload.pm 135]
    Also, installing more RAM and fine-tuning your Windows system will help to reduce VM usage. There are plenty of operating system memory optimization tools available for Windows, like XP tweak, magic rabit, windows optimizer, etc.
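
The kind of module trace shown above can be roughly approximated with core Perl alone, by comparing %INC before and after a module is loaded. This is only a sketch: new_files_for() is a hypothetical helper name, and modules that were already loaded (here strict.pm and warnings.pm) will not show up in the difference. The exact list depends on the Perl version.

```perl
use strict;
use warnings;

# Return the files that loading $module adds to %INC.
# new_files_for() is a hypothetical helper, not part of any module.
sub new_files_for {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # IO::File -> IO/File.pm
    my %before = map { $_ => 1 } keys %INC;    # snapshot of what is already loaded
    require $file;                              # load the module at run time
    my @new = sort grep { !$before{$_} } keys %INC;
    return @new;
}

# When run as a script, list everything IO::File pulls in.
unless (caller) {
    print "$_\n" for new_files_for('IO::File');
}
```

A second call for the same module returns an empty list, because everything it needs is already in %INC. That is the same effect the UPDATE in the question describes: a small module looks expensive only the first time its common dependencies are loaded.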

Fork + Memory
by exussum0 (Vicar) on Dec 30, 2003 at 03:28 UTC
    Some advice on understanding processes being spawned (should you be using fork)

    Use fork and modify as little as you can.

    Short explanation...(stolen from here)
    Recall that the UNIX fork() system call creates a child process whose address space is a copy of its parent's address space. Most earlier UNIX implementations copied each writable page in the parent's address space to the corresponding page in the child's address space. Some modern UNIX-like operating systems, such as Mach, implement fork() using a technique called copy-on-write in which until either parent or child writes to a page, they share the same physical copy.

    Long explanation
    Most operating systems natively treat fork'd processes much as they do threads: at first, they share the same memory completely. The difference is that, usually, when a forked process writes to any memory, the part being written gets copied into newly allocated memory, so the process is copied only partially.

    For instance, if you have a 20 meg process that spawns off 40 other ones, and each fork modifies, say, 8 bytes of memory, then those 8 bytes and whatever blocks contain them may be copied for each fork. Since the rest of the data/code should be the same between the 41 processes, you still have those 20 megs all sharing the same piece. Minimally, you'd have a 20 meg process and 40 copies of the blocks holding those 8 bytes, unique to each fork. That's still close to 20 megs.

    Some older systems do NOT support this "advanced technology": as soon as you fork, each child gets a new copy of the whole process, so a 20 meg process that forks 40 times goes from 20 megs of total consumption to 20 + (40*20) megs, i.e. 820 megs.
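
The arithmetic in the two paragraphs above can be checked with a short sketch. The 4 KB page size and the assumption that each child dirties exactly one page are illustrative guesses, not figures from the post, and total_mb() is a hypothetical helper.

```perl
use strict;
use warnings;

# Estimate total memory (in MB) for one parent plus its forked children.
# Assumed inputs: page_kb (page size) and dirty_pages (pages each child writes to).
sub total_mb {
    my (%arg) = @_;
    my $parent   = $arg{parent_mb};
    my $children = $arg{children};

    # Without copy-on-write, every child gets a full private copy.
    return $parent + $children * $parent unless $arg{copy_on_write};

    # With copy-on-write, a child only pays for the pages it writes to.
    my $private_mb = $arg{dirty_pages} * $arg{page_kb} / 1024;
    return $parent + $children * $private_mb;
}

printf "without copy-on-write: %.2f MB\n",
    total_mb(parent_mb => 20, children => 40, copy_on_write => 0);
printf "with copy-on-write:    %.2f MB\n",
    total_mb(parent_mb => 20, children => 40, copy_on_write => 1,
             dirty_pages => 1, page_kb => 4);
```

Under these assumptions the first figure is the 820 megs from the post, and the second stays a little above 20 megs, since 40 children each own a single 4 KB page.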

    Threads are different of course, since two threads can modify the same variable/data without causing a copy. Causing a unique copy if they both wrote to the same data would defeat the point of the thread concept, since they are supposed to share the same data anyway, right? Right. UPDATE: Prior paragraph is wrong. In perl 5.8.x, ithreads do a full copy (not copy-on-write), as the replier showed in a thread. Ew.

    Why it is important to perl
    This is the difference between when you "use" modules, create variables and such. I can't speak much for how Perl's fork gets abstracted away from C's fork, but I'm sure a lot can hold true. When you fork, only create/modify "new information" and try to keep common things in your parent process. Forking after loading 30 or 40 modules and modifying little will keep common data among processes common.

    So don't do ...

        my $y;
        my @database = (1..1000);
        for (my $x = 0; $x < 100; $x++) {
            my $pid = fork;
            if (defined $pid && $pid == 0) {    # fork returns 0 in the child
                @database = reverse @database;  # writing forces a private copy
                $y = 'A' x 1000;
                system "ping -n 1 $x.$x.$x.$x";
                exit(0);
            }
        }
    But do this..
        my $y;
        my @database = reverse(1..1000);        # prepare shared data before forking
        $y = 'A' x 1000;
        for (my $x = 0; $x < 100; $x++) {
            my $pid = fork;
            if (defined $pid && $pid == 0) {    # fork returns 0 in the child
                system "ping -n 1 $x.$x.$x.$x";
                exit(0);
            }
        }
    You'll save a lot of memory doing things this way... Update: Took out use statements
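
A slightly fuller sketch of the same pattern, with the parent also reaping its children, might look like the following. run_children() is a hypothetical helper and the child body is a stand-in for real work. Note that Perl can still write to pages it appears only to read (reference counts are stored alongside the data), so the sharing is best-effort rather than guaranteed.

```perl
use strict;
use warnings;

# Build everything the children only need to read BEFORE forking,
# so the pages holding it can stay shared under copy-on-write.
my @database = reverse 1 .. 1000;

# Hypothetical helper: fork $n children, let each one read the shared
# data, and return their exit codes in the order they were started.
sub run_children {
    my ($n) = @_;
    my @pids;
    for my $i (1 .. $n) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                     # child: fork returned 0
            my $first = $database[0];        # read-only use of parent data
            exit($first == 1000 ? 0 : 1);
        }
        push @pids, $pid;                    # parent: remember the child
    }
    my @codes;
    for my $pid (@pids) {
        waitpid($pid, 0);                    # reap so no zombies are left
        push @codes, $? >> 8;                # exit status is in the high byte
    }
    return @codes;
}

unless (caller) {
    my @codes = run_children(4);
    print "exit codes: @codes\n";
}
```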

    Play that funky music white boy..
      Threads are different of course, since two threads can modify the same variable/data without causing a copy. Causing a unique copy if they both wrote to the same data would defeat the point of the thread concept, since they are supposed to share the same data anyway, right? Right.

      In theory. Unfortunately, in current Perl threading everything is copied (not copy-on-write), and sharing must be done explicitly (see this thread for example).

      use is a compile time directive. It makes no difference whatsoever where the use statements appear.
      Copy on write may not help much on Win32 (because of the fake "fork" that Perl uses), but it will certainly make a big difference when this is switched to Linux.
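
The point that use is a compile-time directive can be seen in a small sketch. File::Basename and Text::Wrap are arbitrary choices for illustration: the use inside a block that never runs is still loaded, while the equivalent require is skipped.

```perl
use strict;
use warnings;

# Run-time check placed BEFORE the use statement in the file.
my $loaded_early = exists $INC{'File/Basename.pm'} ? 1 : 0;

if (0) {
    use File::Basename;    # never "reached", yet loaded during compilation
}

if (0) {
    require Text::Wrap;    # require is a run-time statement, so it is skipped
}

my $wrap_loaded = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

print "File::Basename loaded before its use line ran: $loaded_early\n";
print "Text::Wrap loaded by the skipped require: $wrap_loaded\n";
```

So moving a use line below the fork does not keep the module out of the parent. To load a module only in some processes, require it at run time inside the branch that needs it.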

Node Type: perlquestion [id://317607]
Approved by Roger