http://qs321.pair.com?node_id=632864

princepawn has asked for the wisdom of the Perl Monks concerning the following question:

Hello parsing fans, let's start with some sample data:
    Mon Oct 1 17:09:23 2001 0 127.0.0.1 2611 1774034 a _ o r tmbranno ftp 0 * c
    Mon Oct 1 17:09:27 2001 0 127.0.0.1 22 1774034 a _ o r tmbranno ftp 0 * c
    Mon Oct 1 17:09:27 2001 0 127.0.0.1 22 file with spaces in it.zip a _ o r tmbranno ftp 0 * c
    Mon Oct 1 17:09:31 2001 0 127.0.0.1 7276 p1774034_11i_zhs.zip a _ o r tmbranno ftp 0 * c
Now, if it were not for the 3rd line, I could simply split on whitespace to get each field:
    our @field = qw(day_name month day current_time year transfer_time
                    remote_host file_size filename transfer_type
                    special_action_flag direction access_mode username
                    service_name authentication_method
                    authenticated_user_id completion_status);
    my %field;
    @field{@field} = split /\s+/, $line;
Then we have our data in a hash and can access fields by name instead of position. This is how my module Net::FTPServer::XferLog has worked fine for years, but I just learned of a poor guy getting filenames with spaces in them. So my approach to this problem is to split as normal, but shift and pop off data with care from either side of the filename field, and then join whatever is left with a single space to make the filename field:
    sub parse_line {
        my $self = shift;
        my $line = shift or die "must supply xferlog line";

        my @field = qw(day_name month day current_time year transfer_time
                       remote_host file_size filename transfer_type
                       special_action_flag direction access_mode username
                       service_name authentication_method
                       authenticated_user_id completion_status);
        my %field;

        my @tmp = split /\s+/, $line;

        if (scalar @tmp == scalar @field) {
            @field{@field} = @tmp;
        } else {
            for (@field) {
                last if $_ eq 'filename';
                $field{$_} = shift @tmp;
            }
            @field = reverse @field;
            @tmp   = reverse @tmp;
            for (@field) {
                last if $_ eq 'filename';
                $field{$_} = shift @tmp;
            }
            @tmp = reverse @tmp;
            $field{filename} = "@tmp";
        }

        # map { print "$_ => $field{$_} \n" } @field;
        # print "-------------------";
        \%field;
    }
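For comparison, the same take-from-both-ends idea can probably be written more compactly with splice. This is only a sketch, not what Net::FTPServer::XferLog actually does: the names parse_line_splice and @FIELD are made up for illustration, and it assumes the filename is the only field that can contain whitespace. Like the version above, it collapses runs of whitespace inside the filename to a single space.

```perl
use strict;
use warnings;

# Field names in log order (same list as in the post).
my @FIELD = qw(day_name month day current_time year transfer_time
               remote_host file_size filename transfer_type
               special_action_flag direction access_mode username
               service_name authentication_method
               authenticated_user_id completion_status);

# Hypothetical helper: peel fixed fields off both ends with splice,
# then treat whatever remains in the middle as the filename.
sub parse_line_splice {
    my ($line) = @_;
    my @tmp = split /\s+/, $line;
    return if @tmp < @FIELD;    # too few tokens to be a complete line

    # Position of 'filename' in the field list, and how many fields follow it.
    my ($idx) = grep { $FIELD[$_] eq 'filename' } 0 .. $#FIELD;
    my $after = $#FIELD - $idx;

    my %field;
    # Leading fields come off the front of the token list...
    @field{ @FIELD[0 .. $idx - 1] } = splice @tmp, 0, $idx;
    # ...trailing fields come off the back (negative offset counts from the end)...
    @field{ @FIELD[$idx + 1 .. $#FIELD] } = splice @tmp, -$after;
    # ...and the leftover tokens are rejoined as the filename.
    $field{filename} = join ' ', @tmp;

    return \%field;
}

my $rec = parse_line_splice(
    'Mon Oct 1 17:09:27 2001 0 127.0.0.1 22 file with spaces in it.zip a _ o r tmbranno ftp 0 * c'
);
print "$rec->{filename}\n";    # file with spaces in it.zip
print "$rec->{username}\n";    # tmbranno
```

Because the field counts on either side of filename are derived from the list itself, nothing needs special-casing when a line has exactly as many tokens as fields.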

But that is not very 'phisticated and I just KNOW some 1337 h4x0R out there is dying to flex his text parsing skIllZ and make the crowd go ooh and ahhh, so show me whatcha got!


Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality