Hello parsing fans, let's start with some sample data:
    Mon Oct 1 17:09:23 2001 0 127.0.0.1 2611 1774034 a _ o r tmbranno ftp 0 * c
    Mon Oct 1 17:09:27 2001 0 127.0.0.1 22 1774034 a _ o r tmbranno ftp 0 * c
    Mon Oct 1 17:09:27 2001 0 127.0.0.1 22 file with spaces in it.zip a _ o r tmbranno ftp 0 * c
    Mon Oct 1 17:09:31 2001 0 127.0.0.1 7276 p1774034_11i_zhs.zip a _ o r tmbranno ftp 0 * c
Now, if it were not for the 3rd line, I could simply split on whitespace to get each field:
    our @field = qw(day_name month day current_time year transfer_time
                    remote_host file_size filename transfer_type
                    special_action_flag direction access_mode username
                    service_name authentication_method
                    authenticated_user_id completion_status);
    my %field;
    @field{@field} = split /\s+/, $line;
Then we have our data in a hash and can access fields by name instead of position. This is how my module Net::FTPServer::XferLog has worked fine for years, but I just learned of a poor guy getting filenames with spaces in them. So my approach to this problem is to split like normal, but shift and pop off data with care from either side of the filename field, and then join whatever is left with a single space to make the filename field:
    sub parse_line {
        my $self = shift;
        my $line = shift or die "must supply xferlog line";

        my @field = qw(day_name month day current_time year transfer_time
                       remote_host file_size filename transfer_type
                       special_action_flag direction access_mode username
                       service_name authentication_method
                       authenticated_user_id completion_status);
        my %field;
        my @tmp = split /\s+/, $line;

        if (scalar @tmp == scalar @field) {
            # No spaces in the filename: tokens map one-to-one onto fields.
            @field{@field} = @tmp;
        }
        else {
            # Take the fields before the filename from the front.
            for (@field) {
                last if $_ eq 'filename';
                $field{$_} = shift @tmp;
            }

            # Take the fields after the filename from the back.
            @field = reverse @field;
            @tmp   = reverse @tmp;
            for (@field) {
                last if $_ eq 'filename';
                $field{$_} = shift @tmp;
            }

            # Whatever remains is the filename, rejoined with single spaces.
            @tmp = reverse @tmp;
            $field{filename} = "@tmp";
        }

        # map { print "$_ => $field{$_} \n" } @field;
        # print "-------------------";
        \%field;
    }
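The same outside-in idea can probably be condensed with splice. What follows is only a minimal sketch, not the code in Net::FTPServer::XferLog: the name parse_xferlog_line and the @head / @tail arrays are made up for illustration, and it assumes every valid line carries exactly 8 fields before the filename and 9 after it, as in the field list above.

```perl
use strict;
use warnings;

# Field names on either side of the filename, taken from the list above.
my @head = qw(day_name month day current_time year transfer_time
              remote_host file_size);
my @tail = qw(transfer_type special_action_flag direction access_mode
              username service_name authentication_method
              authenticated_user_id completion_status);

# Hypothetical helper, not part of Net::FTPServer::XferLog.
sub parse_xferlog_line {
    my ($line) = @_;

    # split ' ' is the awk-style split: it also skips leading whitespace.
    my @tok = split ' ', $line;

    # A usable line needs every fixed field plus at least one filename token.
    return if @tok < @head + @tail + 1;

    my %field;
    @field{@head} = splice @tok, 0, scalar @head;    # fixed fields from the left
    @field{@tail} = splice @tok, -scalar(@tail);     # fixed fields from the right
    $field{filename} = join ' ', @tok;               # the rest is the filename
    return \%field;
}

my $rec = parse_xferlog_line(
    'Mon Oct 1 17:09:27 2001 0 127.0.0.1 22 file with spaces in it.zip '
  . 'a _ o r tmbranno ftp 0 * c'
);
print "$rec->{filename}\n";    # prints "file with spaces in it.zip"
```

Like the "@tmp" interpolation above, this rejoins the filename with single spaces, so a filename containing a run of consecutive spaces would come back with them collapsed.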

But that is not very 'phisticated and I just KNOW some 1337 h4x0R out there is dying to flex his text parsing skIllZ and make the crowd go ooh and ahhh, so show me whatcha got!


Carter's compass: I know I'm on the right track when by deleting something, I'm adding functionality

In reply to parsing a space-separated filename in a line with fields separated by spaces by princepawn
