PerlMonks  

Re^3: system calls vs perl functions: Which should be faster in this example?

by 0xbeef (Hermit)
on Dec 06, 2005 at 23:34 UTC


in reply to Re^2: system calls vs perl functions: Which should be faster in this example?
in thread system calls vs perl functions: Which should be faster in this example?

I have been working on something similar, so here are some tips that may improve your performance:

1. Improve the command execution time by using "lsof -Pn"

From the lsof manpage:
-P: This option inhibits the conversion of port numbers to port names for network files. Inhibiting the conversion may make lsof run a little faster.
-n: This option inhibits the conversion of network numbers to host names for network files. Inhibiting conversion may make lsof run faster. It is also useful when host name lookup is not working properly.

2. Reduce the overall output size and simplify parsing by using "lsof -Pn -F" (the -F option is specifically intended for post-processing by other programs, such as Perl scripts)

Although the -F output is harder to read by eye, it presents the same information in a terse, field-per-line representation. For some Perl examples, have a look in the lsof "scripts" directory.

Here is the list_fields.pl example, which illustrates the parsing technique:

    $fhdr = 0;                                                      # fd hdr. flag
    $fdst = 0;                                                      # fd state
    $access = $devch = $devn = $fd = $inode = $lock = $name = "";   # | file descr.
    $offset = $proto = $size = $state = $stream = $type = "";       # | variables
    $pidst = 0;                                                     # process state
    $cmd = $login = $pgrp = $pid = $ppid = $uid = "";               # process var.

    # Process the ``lsof -F'' output a line at a time, gathering
    # the variables for a process together before printing them;
    # then gathering the variables for each file descriptor
    # together before printing them.

    while (<>) {
        chop;
        if (/^p(.*)/) {

            # A process set begins with a PID field whose ID character is `p'.

            $tpid = $1;
            if ($pidst) { &list_proc }
            $pidst = 1;
            $pid = $tpid;
            if ($fdst) { &list_fd; $fdst = 0; }
            next;
        }

        # Save process-related values.

        if (/^g(.*)/) { $pgrp = $1; next; }
        if (/^c(.*)/) { $cmd = $1; next; }
        if (/^u(.*)/) { $uid = $1; next; }
        if (/^L(.*)/) { $login = $1; next; }
        if (/^R(.*)/) { $ppid = $1; next; }

        # A file descriptor set begins with a file descriptor field whose ID
        # character is `f'.

        if (/^f(.*)/) {
            $tfd = $1;
            if ($pidst) { &list_proc }
            if ($fdst) { &list_fd }
            $fd = $tfd;
            $fdst = 1;
            next;
        }

        # Save file set information.

        if (/^a(.*)/) { $access = $1; next; }
        if (/^C(.*)/) { next; }
        if (/^d(.*)/) { $devch = $1; next; }
        if (/^D(.*)/) { $devn = $1; next; }
        if (/^F(.*)/) { next; }
        if (/^G(.*)/) { next; }
        if (/^i(.*)/) { $inode = $1; next; }
        if (/^k(.*)/) { next; }
        if (/^l(.*)/) { $lock = $1; next; }
        if (/^N(.*)/) { next; }
        if (/^o(.*)/) { $offset = $1; next; }
        if (/^P(.*)/) { $proto = $1; next; }
        if (/^s(.*)/) { $size = $1; next; }
        if (/^S(.*)/) { $stream = $1; next; }
        if (/^t(.*)/) { $type = $1; next; }
        if (/^T(.*)/) {
            if ($state eq "") { $state = "(" . $1; }
            else { $state = $state . " " . $1; }
            next;
        }
        if (/^n(.*)/) { $name = $1; next; }
        print "ERROR: unrecognized: \"$_\"\n";
    }

    # Flush any stored file or process output.

    if ($fdst) { &list_fd }
    if ($pidst) { &list_proc }
    exit(0);


    ## list_fd -- list file descriptor information
    #             Values are stored inelegantly in global variables.

    sub list_fd {
        if ( ! $fhdr) {

            # Print header once.

            print " FD TYPE DEVICE SIZE/OFF INODE NAME\n";
            $fhdr = 1;
        }
        printf " %4s%1.1s%1.1s %4.4s", $fd, $access, $lock, $type;
        $tmp = $devn; if ($devch ne "") { $tmp = $devch }
        printf " %10.10s", $tmp;
        $tmp = $size; if ($offset ne "") { $tmp = $offset }
        printf " %10.10s", $tmp;
        $tmp = $inode; if ($proto ne "") { $tmp = $proto }
        printf " %10.10s", $tmp;
        $tmp = $stream; if ($name ne "") { $tmp = $name }
        print " ", $tmp;
        if ($state ne "") { printf " %s)\n", $state; } else { print "\n"; }

        # Clear variables.

        $access = $devch = $devn = $fd = $inode = $lock = $name = "";
        $offset = $proto = $size = $state = $stream = $type = "";
    }


    # list_proc -- list process information
    #              Values are stored inelegantly in global variables.

    sub list_proc {
        print "COMMAND PID PGRP PPID USER\n";
        $tmp = $uid; if ($login ne "") { $tmp = $login }
        printf "%-9.9s %6d %6d %6d %s\n", $cmd, $pid, $pgrp, $ppid, $tmp;

        # Clear variables.

        $cmd = $login = $pgrp = $pid = $uid = "";
        $fhdr = $pidst = 0;
    }

(The code was written by the lsof author, Victor A. Abell).
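If the global variables in the script above get in the way, the same idea (a `p` field starts a process set, an `f` field starts a file set) can be sketched with nested hashes instead. This is only a minimal illustration, not the lsof author's method: `parse_lsof_fields` is a hypothetical helper name, it handles only a subset of the field IDs that appear in the script above, and the sample input is made up so the sketch runs without lsof installed.

```perl
use strict;
use warnings;

# Parse ``lsof -F'' style lines into a list of process records:
#   { pid => ..., cmd => ..., files => [ { fd => ..., type => ..., name => ... }, ... ] }
# Hypothetical helper; only field IDs used in the script above are mapped.
sub parse_lsof_fields {
    my @lines = @_;
    my (@procs, $proc, $file);

    # Map single-character field IDs to record keys.
    my %proc_key = (g => 'pgrp', c => 'cmd', u => 'uid', L => 'login', R => 'ppid');
    my %file_key = (a => 'access', l => 'lock', t => 'type', s => 'size',
                    o => 'offset', i => 'inode', n => 'name');

    for my $line (@lines) {
        chomp(my $l = $line);
        next unless length $l;
        my ($id, $value) = (substr($l, 0, 1), substr($l, 1));

        if ($id eq 'p') {                 # a PID field starts a new process set
            $proc = { pid => $value, files => [] };
            push @procs, $proc;
            undef $file;
        }
        elsif ($id eq 'f' && $proc) {     # an fd field starts a new file set
            $file = { fd => $value };
            push @{ $proc->{files} }, $file;
        }
        elsif ($file && exists $file_key{$id}) {
            $file->{ $file_key{$id} } = $value;
        }
        elsif ($proc && !$file && exists $proc_key{$id}) {
            $proc->{ $proc_key{$id} } = $value;
        }
        # Any other field ID is silently ignored in this sketch.
    }
    return \@procs;
}

# Made-up sample input, for illustration only (not captured lsof output).
my @sample = ("p1234\n", "cperl\n", "u500\n",
              "fcwd\n", "tDIR\n", "n/tmp\n",
              "f3\n", "tREG\n", "n/tmp/example.log\n");

my $procs = parse_lsof_fields(@sample);
for my $p (@$procs) {
    printf "%s (pid %s): %d open file(s)\n",
        $p->{cmd}, $p->{pid}, scalar @{ $p->{files} };
}
```

To feed it real data, the lines could come from a pipe opened on "lsof -Pn -F" rather than from the sample array.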

0xbeef

Re^4: system calls vs perl functions: Which should be faster in this example?
by machinecraig (Monk) on Dec 07, 2005 at 16:41 UTC
    Some interesting points raised - thanks! I benchmarked the 'lsof -Pn -c $processname' against the 'lsof -c $processname' that I was using - and surprisingly enough, they seem to be dead even (more or less). I would have thought that using '-Pn' would give a little speed boost - though perhaps it has something to do with our server configuration (little interaction with other servers, etc).

    Also - thanks for the info about the Fields argument, not to mention the great sample script! I'll definitely be playing with that a bit over the next few days.

    Just for the heck of it, here's the code (and results) of the 'lsof -Pn' comparison:
        #!/usr/bin/perl -w
        use Benchmark qw(:all);
        use strict;

        my $process  = "processX";
        my $lsofcmd1 = "/usr/local/bin/lsof -Pn -c $process|wc -l";
        my $lsofcmd2 = "/usr/local/bin/lsof -c $process|wc -l";
        my $count    = 1000;

        my $results = timethese($count,
            {
                'lsof_Pn_c' => sub { system($lsofcmd1); },
                'lsof_c'    => sub { system($lsofcmd2); },
            },
            'none'
        );
        print ("\n\n");
        cmpthese( $results ) ;

        ### Results (ran it several times and got similar numbers) ###
        #             Rate lsof_Pn_c    lsof_c
        # lsof_Pn_c  370/s        --       -0%
        # lsof_c     372/s        0%        --
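Both commands in the benchmark above contain a `|`, so each `system` call goes through a shell and also starts `wc`. One possible variation, sketched here under assumptions, is to open the command directly with a list-form pipe and count the lines in Perl. `count_output_lines` is a hypothetical helper name, and the demo runs the current perl binary as a stand-in command so it works without lsof. Whether this changes the timings on any given server is untested.

```perl
use strict;
use warnings;

# Run a command without a shell and count its output lines in Perl.
# count_output_lines is a hypothetical helper, not part of any module.
sub count_output_lines {
    my @cmd = @_;

    # List-form pipe open runs the command directly, with no /bin/sh in between.
    open(my $fh, '-|', @cmd) or die "cannot run $cmd[0]: $!";
    my $count = 0;
    $count++ while <$fh>;
    close($fh);
    return $count;
}

# Stand-in command so the sketch runs anywhere: the current perl prints 3 lines.
my $n = count_output_lines($^X, '-e', 'print "line\n" for 1 .. 3');
print "lines: $n\n";

# With lsof it might be called like this (path and process name as in the post above):
# my $open = count_output_lines('/usr/local/bin/lsof', '-Pn', '-c', 'processX');
```

Dropping such a call into the `timethese` hash as a third entry would put it side by side with the two `system` variants.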
