I have been working on something similar, so here are some tips that may improve your performance:

1. Improve the command execution time by using "lsof -Pn"

From the lsof manpage:
-P: This option inhibits the conversion of port numbers to port names for network files. Inhibiting the conversion may make lsof run a little faster.
-n: This option inhibits the conversion of network numbers to host names for network files. Inhibiting conversion may make lsof run faster. It is also useful when host name lookup is not working properly.
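To see whether the two options matter on a given machine, the plain and the "-Pn" invocations can be timed side by side. The following is only a rough sketch: the helper name time_command is made up for this example, it assumes lsof is on the PATH, and a single run each is not a proper benchmark (caches and DNS state will skew it).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run a command in list form (no shell involved),
# discard its output, and return the elapsed wall-clock seconds.
# Returns undef if the command could not be started.
sub time_command {
    my @cmd = @_;
    my $start = time;
    my $pid = open(my $out, '-|', @cmd);
    return unless $pid;
    1 while <$out>;    # drain the output so the command can finish
    close $out;
    return time - $start;
}

# Compare plain lsof with lsof -Pn; report instead of dying if lsof is missing.
for my $cmd (['lsof'], ['lsof', '-Pn']) {
    my $secs = time_command(@$cmd);
    if (defined $secs) {
        printf "%-10s %.3f s\n", "@$cmd", $secs;
    }
    else {
        print "@$cmd: could not run ($!)\n";
    }
}
```

The difference tends to be largest when host name lookups are slow or failing, which is the case the manpage calls out for -n.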

2. Reduce the overall output size and simplify parsing by using "lsof -Pn -F". The -F option is specifically intended for output that is post-processed by another program, such as a Perl script.

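The format is harder to read by eye, but it presents the same information in a terse form, with one field per line and a single identifying character at the start of each line. For some Perl examples, have a look in the "scripts" directory of the lsof distribution.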

Here is the list_fields.pl example, which illustrates the parsing technique:

$fhdr = 0;                                                      # fd hdr. flag
$fdst = 0;                                                      # fd state
$access = $devch = $devn = $fd = $inode = $lock = $name = "";   # | file descr.
$offset = $proto = $size = $state = $stream = $type = "";       # | variables
$pidst = 0;                                                     # process state
$cmd = $login = $pgrp = $pid = $ppid = $uid = "";               # process var.

# Process the ``lsof -F'' output a line at a time, gathering
# the variables for a process together before printing them;
# then gathering the variables for each file descriptor
# together before printing them.

while (<>) {
    chop;
    if (/^p(.*)/) {

        # A process set begins with a PID field whose ID character is `p'.

        $tpid = $1;
        if ($pidst) { &list_proc }
        $pidst = 1;
        $pid = $tpid;
        if ($fdst) { &list_fd; $fdst = 0; }
        next;
    }

# Save process-related values.

    if (/^g(.*)/) { $pgrp = $1; next; }
    if (/^c(.*)/) { $cmd = $1; next; }
    if (/^u(.*)/) { $uid = $1; next; }
    if (/^L(.*)/) { $login = $1; next; }
    if (/^R(.*)/) { $ppid = $1; next; }

# A file descriptor set begins with a file descriptor field whose ID
# character is `f'.

    if (/^f(.*)/) {
        $tfd = $1;
        if ($pidst) { &list_proc }
        if ($fdst) { &list_fd }
        $fd = $tfd;
        $fdst = 1;
        next;
    }

# Save file set information.

    if (/^a(.*)/) { $access = $1; next; }
    if (/^C(.*)/) { next; }
    if (/^d(.*)/) { $devch = $1; next; }
    if (/^D(.*)/) { $devn = $1; next; }
    if (/^F(.*)/) { next; }
    if (/^G(.*)/) { next; }
    if (/^i(.*)/) { $inode = $1; next; }
    if (/^k(.*)/) { next; }
    if (/^l(.*)/) { $lock = $1; next; }
    if (/^N(.*)/) { next; }
    if (/^o(.*)/) { $offset = $1; next; }
    if (/^P(.*)/) { $proto = $1; next; }
    if (/^s(.*)/) { $size = $1; next; }
    if (/^S(.*)/) { $stream = $1; next; }
    if (/^t(.*)/) { $type = $1; next; }
    if (/^T(.*)/) {
        if ($state eq "") { $state = "(" . $1; }
        else { $state = $state . " " . $1; }
        next;
    }
    if (/^n(.*)/) { $name = $1; next; }
    print "ERROR: unrecognized: \"$_\"\n";
}

# Flush any stored file or process output.

if ($fdst) { &list_fd }
if ($pidst) { &list_proc }
exit(0);


## list_fd -- list file descriptor information
#      Values are stored inelegantly in global variables.

sub list_fd {
    if ( ! $fhdr) {

        # Print header once.

        print "      FD   TYPE      DEVICE   SIZE/OFF      INODE  NAME\n";
        $fhdr = 1;
    }
    printf "    %4s%1.1s%1.1s %4.4s", $fd, $access, $lock, $type;
    $tmp = $devn; if ($devch ne "") { $tmp = $devch }
    printf "  %10.10s", $tmp;
    $tmp = $size; if ($offset ne "") { $tmp = $offset }
    printf " %10.10s", $tmp;
    $tmp = $inode; if ($proto ne "") { $tmp = $proto }
    printf " %10.10s", $tmp;
    $tmp = $stream; if ($name ne "") { $tmp = $name }
    print "  ", $tmp;
    if ($state ne "") { printf " %s)\n", $state; } else { print "\n"; }

# Clear variables.

    $access = $devch = $devn = $fd = $inode = $lock = $name = "";
    $offset = $proto = $size = $state = $stream = $type = "";
}


# list_proc -- list process information
#       Values are stored inelegantly in global variables.

sub list_proc {
    print "COMMAND       PID    PGRP    PPID  USER\n";
    $tmp = $uid; if ($login ne "") {$tmp = $login }
    printf "%-9.9s  %6d  %6d  %6d  %s\n", $cmd, $pid, $pgrp, $ppid, $tmp;

# Clear variables.

    $cmd = $login = $pgrp = $pid = $uid = "";
    $fhdr = $pidst = 0;
}

(The code was written by the lsof author, Victor A. Abell).
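If you want the data in a structure rather than printed as it streams past, the same "p starts a process, f starts a file descriptor" rule can feed nested hashes. This is my own minimal sketch, not part of the lsof scripts: the function name parse_lsof_fields and the record layout are invented for illustration, and the sample input is hand-written in the -F style rather than captured from a real run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read "lsof -F" style lines from a filehandle and
# return a reference to a list of process records, each of the form
#   { p => pid, c => command, ..., files => [ { f => fd, n => name, ... } ] }
# Keys are the single-character field IDs used by lsof.
sub parse_lsof_fields {
    my ($fh) = @_;
    my (@procs, $proc, $file);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        my ($id, $value) = (substr($line, 0, 1), substr($line, 1));
        if ($id eq 'p') {               # a PID field starts a new process set
            $proc = { p => $value, files => [] };
            push @procs, $proc;
            $file = undef;
        }
        elsif ($id eq 'f') {            # an fd field starts a new file set
            next unless $proc;
            $file = { f => $value };
            push @{ $proc->{files} }, $file;
        }
        elsif ($file) {                 # field belongs to the current file
            if ($id eq 'T') { push @{ $file->{T} }, $value }   # T may repeat
            else            { $file->{$id} = $value }
        }
        elsif ($proc) {                 # field belongs to the current process
            $proc->{$id} = $value;
        }
    }
    return \@procs;
}

# Hand-written sample in the -F format (illustrative, not real output).
my $sample = <<'END';
p1234
csshd
u0
Lroot
f3
au
tIPv4
PTCP
n*:22
TST=LISTEN
p5678
cperl
u1000
fcwd
tDIR
n/home/user
END

open(my $fh, '<', \$sample) or die "cannot open in-memory sample: $!";
my $procs = parse_lsof_fields($fh);
close $fh;

for my $p (@$procs) {
    printf "%s (pid %s)\n", $p->{c}, $p->{p};
    for my $f (@{ $p->{files} }) {
        printf "  fd %-4s %-5s %s\n", $f->{f}, $f->{t} // '', $f->{n} // '';
    }
}
```

In real use the filehandle would be a pipe opened on something like "lsof -Pn -F", so nothing is buffered beyond the records you choose to keep.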

0xbeef

In reply to Re^3: system calls vs perl functions: Which should be faster in this example? by 0xbeef
in thread system calls vs perl functions: Which should be faster in this example? by machinecraig
