PerlMonks  

Re: Parsing a log file

by johngg (Canon)
on Nov 27, 2007 at 19:55 UTC


in reply to Parsing a log file

Your lines have four space-free items followed by a series of this=that pairs, where the that part may contain spaces. I would first split on whitespace, using split's third argument to limit the result to five fields. I would then use a global regex match on the fifth field to pull out each this and that as a key/value pair and populate a hash. The value is matched non-greedily and a look-ahead stops it just before the next key= (or at the end of the string), so the next pair is not consumed. I use Data::Dumper here to show what has been parsed from the file.

use strict;
use warnings;

use Data::Dumper;

my $rxExtractFields = qr {(?x)
    \s*
    (\S+)
    =
    \s*
    (\S.*?)
    (?= \s*\S+= | \z )
    };

open my $inFH, q{<}, \ <<'END_OF_FILE' or die qq{open: $!\n};
2007-11-16 16:04:33 Local1.Alert 128.29.29.40 id=firewall time="2007-11-16 16:04:08" fw=WS2000-Store 29 pri=1 proto=6(tcp) src=128.29.29.200 dst=128.29.100.102 mid= 1013 mtp= 2 msg=TCP connection request received is invalid, dropping packet Src 23 Dst 4412 from EXT n/w agent=Firewall
2007-11-16 16:05:05 Local1.Alert 128.24.24.40 id=firewall time="2007-11-16 16:03:25" fw=WS2000-Store 24 pri=1 proto=6(tcp) src=128.24.24.200 dst=128.24.100.101 mid= 1013 mtp= 2 msg=TCP connection request received is invalid, dropping packet Src 23 Dst 4344 from EXT n/w agent=Firewall
2007-11-16 16:05:34 Local1.Alert 128.29.29.40 id=firewall time="2007-11-16 16:05:09" fw=WS2000-Store 29 pri=1 proto=6(tcp) src=128.29.29.200 dst=128.29.100.102 mid= 1013 mtp= 2 msg=TCP connection request received is invalid, dropping packet Src 23 Dst 4412 from EXT n/w agent=Firewall
2007-11-16 16:05:39 Local1.Alert 128.2.2.40 id=firewall time="2007-11-16 16:03:36" fw=WS2000-Store 02 pri=1 proto=6(tcp) src=128.2.2.200 dst=128.2.100.106 mid= 1013 mtp= 2 msg=TCP connection request received is invalid, dropping packet Src 23 Dst 4631 from EXT n/w agent=Firewall
2007-11-16 16:05:40 Local1.Alert 128.2.2.40 id=firewall time="2007-11-16 16:03:36" fw=WS2000-Store 02 pri=1 proto=6(tcp) src=128.2.2.200 dst=128.2.100.106 mid= 1013 mtp= 2 msg=TCP connection request received is invalid, dropping packet Src 23 Dst 4631 from EXT n/w agent=Firewall
2007-11-16 16:05:40 Local1.Alert 128.2.2.40 id=firewall time="2007-11-16 16:03:37" fw=WS2000-Store 02 pri=1 proto=6(tcp) src=128.2.2.200 dst=128.2.100.106 mid= 1013 mtp= 2 msg=TCP connection request received is invalid, dropping packet Src 23 Dst 4631 from EXT n/w agent=Firewall
END_OF_FILE

my @parsedData = ();

while ( <$inFH> )
{
    chomp;
    my ( $date, $time, $type, $ip, $restOfLine ) =
        split m{\s+}, $_, 5;
    my %pairs = $restOfLine =~ m{$rxExtractFields}g;
    push @parsedData,
        {
            field1 => $date,
            field2 => $time,
            field3 => $type,
            field4 => $ip,
            %pairs,
        };
}

close $inFH or die qq{close: $!\n};

print Data::Dumper->Dumpxs( [ \ @parsedData], [ q{*parsedData} ] );

Here's the output.

@parsedData = (
    {
        'msg' => 'TCP connection request received is invalid, dropping packet Src 23 Dst 4412 from EXT n/w',
        'proto' => '6(tcp)',
        'time' => '"2007-11-16 16:04:08"',
        'src' => '128.29.29.200',
        'field4' => '128.29.29.40',
        'field2' => '16:04:33',
        'field3' => 'Local1.Alert',
        'mtp' => '2',
        'mid' => '1013',
        'fw' => 'WS2000-Store 29',
        'field1' => '2007-11-16',
        'agent' => 'Firewall',
        'pri' => '1',
        'id' => 'firewall',
        'dst' => '128.29.100.102'
    },
    {
        'msg' => 'TCP connection request received is invalid, dropping packet Src 23 Dst 4344 from EXT n/w',
        'proto' => '6(tcp)',
        'time' => '"2007-11-16 16:03:25"',
        'src' => '128.24.24.200',
        'field4' => '128.24.24.40',
        'field2' => '16:05:05',
        'field3' => 'Local1.Alert',
        'mtp' => '2',
        'fw' => 'WS2000-Store 24',
        'mid' => '1013',
        'field1' => '2007-11-16',
        'agent' => 'Firewall',
        'id' => 'firewall',
        'pri' => '1',
        'dst' => '128.24.100.101'
    },
    {
        'msg' => 'TCP connection request received is invalid, dropping packet Src 23 Dst 4412 from EXT n/w',
        'proto' => '6(tcp)',
        'time' => '"2007-11-16 16:05:09"',
        'src' => '128.29.29.200',
        'field4' => '128.29.29.40',
        'field2' => '16:05:34',
        'field3' => 'Local1.Alert',
        'mtp' => '2',
        'fw' => 'WS2000-Store 29',
        'mid' => '1013',
        'field1' => '2007-11-16',
        'agent' => 'Firewall',
        'id' => 'firewall',
        'pri' => '1',
        'dst' => '128.29.100.102'
    },
    {
        'msg' => 'TCP connection request received is invalid, dropping packet Src 23 Dst 4631 from EXT n/w',
        'proto' => '6(tcp)',
        'time' => '"2007-11-16 16:03:36"',
        'src' => '128.2.2.200',
        'field4' => '128.2.2.40',
        'field2' => '16:05:39',
        'field3' => 'Local1.Alert',
        'mtp' => '2',
        'fw' => 'WS2000-Store 02',
        'mid' => '1013',
        'field1' => '2007-11-16',
        'agent' => 'Firewall',
        'id' => 'firewall',
        'pri' => '1',
        'dst' => '128.2.100.106'
    },
    {
        'msg' => 'TCP connection request received is invalid, dropping packet Src 23 Dst 4631 from EXT n/w',
        'proto' => '6(tcp)',
        'time' => '"2007-11-16 16:03:36"',
        'src' => '128.2.2.200',
        'field4' => '128.2.2.40',
        'field2' => '16:05:40',
        'field3' => 'Local1.Alert',
        'mtp' => '2',
        'fw' => 'WS2000-Store 02',
        'mid' => '1013',
        'field1' => '2007-11-16',
        'agent' => 'Firewall',
        'id' => 'firewall',
        'pri' => '1',
        'dst' => '128.2.100.106'
    },
    {
        'msg' => 'TCP connection request received is invalid, dropping packet Src 23 Dst 4631 from EXT n/w',
        'proto' => '6(tcp)',
        'time' => '"2007-11-16 16:03:37"',
        'src' => '128.2.2.200',
        'field4' => '128.2.2.40',
        'field2' => '16:05:40',
        'field3' => 'Local1.Alert',
        'mtp' => '2',
        'fw' => 'WS2000-Store 02',
        'mid' => '1013',
        'field1' => '2007-11-16',
        'agent' => 'Firewall',
        'id' => 'firewall',
        'pri' => '1',
        'dst' => '128.2.100.106'
    }
);
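
Note that the time value keeps its surrounding double quotes in the output. If those are not wanted, one possible follow-up is sketched below. It wraps the same two steps (limited split, then global match with look-ahead) in a small sub and strips the quotes afterwards. The sub name parse_line and the shortened sample line are made up for illustration, and stripping the quotes is only an assumption about what you want to do with the data.

```perl
use strict;
use warnings;

# Same pattern as in the script above.
my $rxExtractFields = qr {(?x)
    \s*
    (\S+)
    =
    \s*
    (\S.*?)
    (?= \s*\S+= | \z )
    };

# parse_line is a hypothetical helper name, not part of the original script.
sub parse_line
{
    my ( $line ) = @_;
    chomp $line;

    # A limit of 5 leaves everything after the fourth item in one string.
    my ( $date, $time, $type, $ip, $restOfLine ) =
        split m{\s+}, $line, 5;

    # In list context a global match returns all captures in order,
    # i.e. key, value, key, value ...
    my %pairs = $restOfLine =~ m{$rxExtractFields}g;

    # Assumption: surrounding double quotes on a value are unwanted.
    # values() aliases the hash values so they are changed in place.
    s{\A"(.*)"\z}{$1} for values %pairs;

    return
        {
            field1 => $date,
            field2 => $time,
            field3 => $type,
            field4 => $ip,
            %pairs,
        };
}

# Shortened, made-up line in the same shape as the log lines above.
my $sample =
    q{2007-11-16 16:04:33 Local1.Alert 128.29.29.40 }
    . q{id=firewall time="2007-11-16 16:04:08" fw=WS2000-Store 29 }
    . q{mid= 1013 msg=dropping packet Src 23 agent=Firewall};

my $rec = parse_line( $sample );
print qq{$_ => $rec->{ $_ }\n} for sort keys %$rec;
```

The values containing spaces (fw and msg) come out whole because the non-greedy value only stops where the look-ahead sees the next key= or the end of the string.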

I hope this is of interest.

Cheers,

JohnGG
