PerlMonks  

Re: Matching lines in 2+ GB logfiles.

by Anonymous Monk
on May 02, 2008 at 01:05 UTC ( #684068=note )


in reply to Matching lines in 2+ GB logfiles.

Has anyone here who claims that Perl can't outrun grep actually run the script by dws that I posted here? This dws guy is on to something. I was finally able to modify it to work like grep, and it's about 14 times faster than grep on my file (13 seconds versus 189 seconds for egrep -i). I am working with a 484 MB maillog.

This could be more elegant, but this is my rookie solution:

while ( $window =~ m/([a-zA-Z]{3}\s{1,2}\d{1,2}.*\n)/oigc ) {
    $line = $1;
    if ( $line =~ /$re/ ) {
        &$callback($line);
    }
}
ls -ltrh /var/log/syslog-ng/server2/ | grep maillog.2
-rw-r----- 1 root logs 484M Mar 11 11:13 maillog.2
-rw-r----- 1 root logs 230M Apr 1 04:10 maillog.2.gz

[dmathis@aus02syslog ~]$ date; ./jujuspeed; date
Thu May 1 19:27:57 CDT 2008
Feb 28 09:53:49 exmx2 sendmail[XXXXX]: 8791: to=<hidden@hotmail.com>, delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=X3604, relay=mx1.hotmail.com. [X5.5X.2X5.X], dsn=2.0.0, stat=Sent ( <X4X0399.120421402XXXX.JavaMail.root@hidden.com> Queued mail for delivery)
Thu May 1 19:28:10 CDT 2008
Time taken: 13 Seconds

[dmathis@aus02syslog ~]$ date; egrep -i 'hidden@hotmail.com' /var/log/syslog-ng/server2/maillog.2; date
Thu May 1 19:28:48 CDT 2008
Feb 28 09:53:49 exmx2 sendmail[XXXXX]: 8791: to=<hidden@hotmail.com>, delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=X3604, relay=mx1.hotmail.com. [X5.5X.2X5.X], dsn=2.0.0, stat=Sent ( <X4X0399.120421402XXXX.JavaMail.root@hidden.com> Queued mail for delivery)
Thu May 1 19:31:57 CDT 2008
Time Taken: 189 Seconds
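
For anyone who wants to try the idea without the rest of the script, here is a minimal, self-contained sketch of a chunked scan. It is not the dws script and not my exact code. The name grep_in_chunks, the 1 MiB default chunk size, and the plain "everything up to a newline" line pattern (instead of the syslog date pattern in my loop) are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the dws script): read $path in fixed-size
# chunks and call $callback for every line that matches $re.
sub grep_in_chunks {
    my ($path, $re, $callback, $chunk_size) = @_;
    $chunk_size ||= 1 << 20;    # 1 MiB per read; an arbitrary choice

    open my $fh, '<', $path or die "Cannot open $path: $!";

    my $window = '';
    while (1) {
        # Append the next chunk after any partial line carried over.
        my $got = read($fh, $window, $chunk_size, length $window);
        die "Read error on $path: $!" unless defined $got;
        last if $got == 0;

        # Walk the complete lines in the window. With /gc, pos() stays
        # at the end of the last complete line when the match fails.
        while ($window =~ m/\G([^\n]*\n)/gc) {
            my $line = $1;
            $callback->($line) if $line =~ $re;
        }

        # Keep only the unfinished tail for the next round.
        $window = substr($window, pos($window) || 0);
    }

    # A final line that has no trailing newline.
    $callback->($window) if length $window && $window =~ $re;

    close $fh;
    return;
}
```

Calling grep_in_chunks($file, qr/hidden\@hotmail\.com/i, sub { print $_[0] }) should then print matching lines much like egrep -i does. I have not benchmarked this version, so the timings in this post do not apply to it.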

Thanks for all of the help on here. I have learned a lot :)

Replies are listed 'Best First'.
Re^2: Matching lines in 2+ GB logfiles.
by alexm (Chaplain) on May 02, 2008 at 11:16 UTC
    while ( $window =~ m/([a-zA-Z]{3}\s{1,2}\d{1,2}.*\n)/oigc ) { $line = $1; if ( $1 =~ /$re/ ) { &$callback($line); } }
    This is very close to what mscharrer suggested before.
      Indeed!

      After all this is over, all that will really have mattered is how we treated each other.
