PerlMonks  

All:
I have spent the last 3 weeks converting a suite of shell scripts to Perl. The purpose of this can be found here, although two of my initial requirements changed after a very long, hard look at the transient files I was checking against.

  • Only the first 64K of the file needs to be read - if the string I am looking for is not there, it doesn't matter if it is elsewhere in the file.
  • Removing embedded newlines wasn't a real requirement, since a match in the first 64K will never have the newline problem.
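
    The "first 64K only" requirement can be sketched in isolation. This is a minimal illustration, not the production code: first_chunk_matches and its parameters are hypothetical names, and it assumes a case-insensitive literal match is what is wanted. Setting $/ to a reference to an integer makes a single read return at most that many bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $pattern occurs (case-insensitively, as a
# literal string) within the first $limit bytes of the file at $path.
sub first_chunk_matches {
    my ($path, $pattern, $limit) = @_;
    $limit ||= 65536;                      # default to 64K

    open(my $fh, '<', $path) or return 0;  # unreadable file counts as no match
    binmode($fh);

    local $/ = \$limit;                    # fixed-size record: one read, at most $limit bytes
    my $chunk = <$fh>;
    close($fh);

    return 0 unless defined $chunk;        # empty file
    return $chunk =~ /\Q$pattern\E/i ? 1 : 0;
}
```

    Anything past the first chunk is never read, so a string that only appears later in the file is treated as absent.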

    The following is the final code:

    #!/usr/bin/perl -w
    use strict;
    use Time::Local;

    chdir "/var/spool/wt400/gateways/" . $ARGV[0];
    mkdir "capture", 0755 unless (-d "capture");

    my $ListTime = 0;
    my %Traps;
    my @Files;
    my $Counter = 1;
    my $Size;
    my $Now;
    my $NF;
    my $Matcher;
    my $Match_code;

    open (LOG, ">>/disk4/Logs/traps/" . $ARGV[0] . "_" . $ARGV[1] . ".log");
    flock(LOG,(2|4)) or exit;
    select LOG;

    while (1) {
        if ($Counter > 20 || ! %Traps) {
            if ( (stat("traplist." . $ARGV[1]))[9] gt $ListTime ) {
                $ListTime = (stat(_))[9];
                %Traps = ();
                open (LIST,"traplist." . $ARGV[1]);
                while (<LIST>) {
                    next if (/^#/ || /^Created\t\tExpires/ || /^\s*$/);
                    my @Fields = split "\t" , $_;
                    next unless (@Fields == 8);
                    chomp $Fields[7];
                    my($mon, $day, $year, $hour, $min) = split ?[-/:]? , $Fields[1];
                    my $Expiration = timelocal(0, $min, $hour, $day, $mon - 1, $year + 100);
                    $Traps{$Fields[7]} = [ $Expiration,@Fields[2,5,6] ];
                }
                close (LIST);
            }
            $Counter = 1;
        }
        $Now = time;
        $Match_code = "";
        $Size = 0;
        foreach my $Trap (keys %Traps) {
            unless ($Traps{$Trap}[0] < $Now && $Traps{$Trap}[1]) {
                if ($Traps{$Trap}[3] eq "SIZE") {
                    $Size = $Traps{$Trap}[2] if ($Traps{$Trap}[2] > 0);
                }
                else {
                    $Trap =~ s/(\W)/\\$1/g;
                    $Trap = "(?i-xsm)" . $Trap;
                    $Match_code .= "return \"$Trap\" if \$_[0] =~ /$Trap/;";
                }
            }
        }
        exit unless ($Match_code || $Size);
        $Matcher = eval "sub {" . $Match_code . "}";
        if ($ARGV[1] eq "out") {
            @Files = <out/do*>;
        }
        elsif ($ARGV[1] eq "in") {
            @Files = <in/di*>;
        }
        else {
            @Files = <out/do* in/di*>
        }
        matchfile(\@Files);
        $Counter++;
        sleep 3;
    }

    sub matchfile {
        local($/) = \65536;
        FILE:
        while (my $File = shift @{$_[0]}) {
            if ($Size && -s $File >= $Size) {
                ($NF = $File) =~ s/^.*\///;
                rename $File , "capture/" . $NF . "-SIZE";
                print time . " " . $NF . " " . (stat(_))[7] . " SIZE\n";
                next FILE;
            }
            unless (open(FILE, $File)) {
                next FILE;
            }
            while (<FILE>) {
                my $Match = $Matcher->($_);
                if ($Match) {
                    $Match =~ s/\(\?i-xsm\)//;
                    ($NF = $File) =~ s/^.*\///;
                    rename $File , "capture/" . $NF . "-" . $Traps{$Match}[3];
                    print time . " " . $NF . " " . (stat(_))[7] . " " . $Traps{$Match}[3] . "\n";
                }
                next FILE;
            }
        }
    }

    The traplist file that the data is read from looks like:

    Created         Expires         Use     Type    Author  Size    Name    Trap
    07:36:56-07:36  07:36:56-07:36  1       0       XYZ     98765   SIZE    N/A
    07:36:56-07:36  07:36:56-07:36  1       0       XYZ     N/A     TRAP1   cool things to look for
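
    For readers following the format, here is a small sketch of how one traplist line maps onto the eight columns above. parse_trap_line and @COLUMNS are hypothetical names used only for illustration; the sketch assumes single-tab-separated fields and skips the header with a looser pattern than the script does.

```perl
use strict;
use warnings;

# Column names taken from the traplist header line.
my @COLUMNS = qw(Created Expires Use Type Author Size Name Trap);

# Hypothetical helper: returns a hash reference keyed by column name,
# or nothing for comments, the header, blank lines, and malformed lines.
sub parse_trap_line {
    my ($line) = @_;
    chomp $line;

    return if $line =~ /^#/ || $line =~ /^Created\s+Expires/ || $line =~ /^\s*$/;

    my @fields = split /\t/, $line;
    return unless @fields == 8;            # same sanity check as the script

    my %trap;
    @trap{@COLUMNS} = @fields;             # hash slice: column name => value
    return \%trap;
}
```

    With the second sample line, the Name field would be TRAP1 and the Trap field would be the string to search for.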
    

  • The first arg names the gateway directory: it is where the traplist file lives, and it is the base directory for the work directory selected by arg 2.
  • The second arg completes the name of the traplist file and selects the directory to work in.

    If arg1 = blah, you would look for the traplist file in /var/spool/wt400/gateways/blah
    If arg2 = out, you would open /var/spool/wt400/gateways/blah/traplist.out and you would do your work in /var/spool/wt400/gateways/blah/out
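
    The mapping from the two args to paths can be written down as a small sketch. trap_paths is a hypothetical helper, not part of the script; it mirrors the script's behaviour of working in both out and in when arg 2 is neither of those.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the traplist path and work directories
# from the gateway name (arg 1) and the direction (arg 2).
sub trap_paths {
    my ($gateway, $direction) = @_;
    my $base = "/var/spool/wt400/gateways/$gateway";

    # Anything other than "out" or "in" means both directories are scanned.
    my @dirs = $direction eq 'out' ? ('out')
             : $direction eq 'in'  ? ('in')
             :                       ('out', 'in');

    return {
        base => $base,
        list => "$base/traplist.$direction",
        work => [ map("$base/$_", @dirs) ],
    };
}
```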

    Ok, so without further ado - here is my problem:

    I need to have about 20 copies of the exact same script running, where the only difference is the two arguments passed to it, because there is a race condition beyond my control. As a result, I am now using way more memory than the shell scripts ever did. I compared:

  • ps -el | grep <shell> - sz = 50
  • ps -el | grep <perl> - sz > 300

    I know where the gap is coming from and I could handle the difference for everything else I gained if it were only one copy, but that difference gets multiplied by every copy running (about 20).

    The only thing that comes to mind is Threads, but I have heard such conflicting information I didn't even consider it when I started the port.

    Do I have to abandon my code, or is there a way to take advantage of my multi-proc high-end server so that one, or maybe two or three, processes handle all the directories?

    Thanks in advance - L~R


    In reply to 3 weeks wasted? - will threads help? by Limbic~Region
