Re: Need a better way to count input lines

by BrowserUk (Patriarch)
on May 07, 2004 at 17:18 UTC ( [id://351536] )


in reply to Need a better way to count input lines

I'd probably do it this way.

#! perl -slw
use strict;

# Each record spans three input lines: name, phone/location, email.
until( eof( DATA ) ) {
    # A missing room number leaves an undefined capture; silence that warning.
    no warnings 'uninitialized';
    printf '%-7.7s %-7.7s ',     <DATA> =~ m[ ([^,]+) , \s* (.+) $]x;
    printf '%7.7s %7.7s %7.7s ', <DATA> =~ m[ ([^/]+) / (?:(\S+)\s+)? (\S+) $]x;
    printf "%s\n",               <DATA> =~ m[^(\S+)];
}
__DATA__
Alanon, Bart
5590/EL
----
O'Lewis, John.
----/--- --
john
Le Much,Bo Jo
3406/165 NS
ed@a.nl
Abe-Jen, Mar-Jo
3421/164D NS
cbest

Output

P:\test>351465
Alanon  Bart       5590              EL ----
O'Lewis John.      ----     ---      -- john
Le Much Bo Jo      3406     165      NS ed@a.nl
Abe-Jen Mar-Jo     3421    164D      NS cbest

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re: Re: Need a better way to count input lines
by Theo (Priest) on May 07, 2004 at 19:53 UTC
    That looks a lot more perlish! I changed it slightly to use the same data in an external file & I get a warning for each line of the file that has a "/". (Of course, to get those warnings I commented out the 'no warnings' line.)
    Last First Phone Bldg Room email
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 2.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 2.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 2.
    ----
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 5.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 5.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 5.
    john
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 8.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 8.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 8.
    ed@a.nl
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 11.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 11.
    Use of uninitialized value in printf at PL.pl line 11, <FH> line 11.
    cbest
    If I redirect the output to a file, it looks like this:
    Last First Phone Bldg Room email
    Alanon Bart^M ----
    O'Lewis John.^M john
    Le Much Bo Jo^M ed@a.nl
    Abe-Jen Mar-Jo^M cbest
    The file, as I ran it is:
    #!/opt/bin/perl -slw
    use strict;

    print "\nLast First Phone Bldg Room email\n";

    open FH => "<testdata.txt" or die "can't find the data file: $!\n";

    until( eof( FH ) ) {
    #    no warnings 'uninitialized';
        printf '%-7.7s %-7.7s ', <FH> =~ m[ ([^,]+) , \s* (.+) $]x;
        printf '%7.7s %7.7s %7.7s ', <FH> =~ m[ ([^/]+) / (?:(\S+)\s+)? (\S+) $]x;
        printf "%s\n", <FH> =~ m[^(\S+)];
    }
    I definitely don't understand what's happening in the second half of the middle regexp.

    -Theo-
    (so many nodes and so little time ... )

      Weird! The ^Ms that are getting left behind on the end of the first names suggest that the contents of the data file are weird, though given the source that's no real surprise.

      The reason for having the no warnings 'uninitialized' is to avoid the need for a special case to deal with location lines that don't contain the room number. The regex is saying:

      m[
          ([^/]+) /    # capture the phone number before the slash
          (?:          # optionally capture the room number if it exists
              (\S+)    # capture all the non-spaces between the /
              \s+      # and one or more spaces
          )?           # but only if there are two sets of non-spaces
                       # separated by one or more spaces before EOL
          (\S+)        # capture the building code.
          $
      ]x;

      If the room number isn't present, the second capture ($2) will be undefined, hence the need to suppress the warning. However, that you are getting three warnings means that all 3 captures are undefined (i.e. the regex failed to match), which suggests that the data in the file is formatted somewhat differently to the sample data you posted.
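To make the optional capture concrete, here is a minimal sketch that runs the same pattern against one location line with a room number and one without. The helper name location_fields and the 'undef' placeholder string are assumptions made for illustration; they are not part of the code posted in the thread.

```perl
use strict;
use warnings;

# Same pattern as in the post.
my $location_re = qr[ ([^/]+) / (?:(\S+)\s+)? (\S+) $]x;

# Hypothetical helper: returns the captures for one location line as
# display strings, with 'undef' standing in for a missing capture.
# A failed match returns an empty list.
sub location_fields {
    my ($line) = @_;
    my @fields = $line =~ $location_re;
    return map { defined $_ ? $_ : 'undef' } @fields;
}

# One line with a room number, one without.
for my $line ( "3406/165 NS\n", "5590/EL\n" ) {
    chomp( my $shown = $line );
    printf "%-12s => %s\n", $shown, join ', ', location_fields($line);
}
```

When the match succeeds but the optional group is skipped, the list still has three elements and only the middle one is undefined. That is the single warning per line that the no warnings 'uninitialized' pragma is there to hide.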

      Without being able to see the actual contents of the file it is a little difficult to diagnose the problem.
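One hypothesis that would fit both symptoms (the ^M after the first names and the complete match failure on the "/" lines) is that the file has CRLF line endings. Nothing in the thread confirms this, so the sketch below is only a check of that assumption; the sample strings are made up to mimic such a file.

```perl
use strict;
use warnings;

# The two patterns from the post.
my $name_re     = qr[ ([^,]+) , \s* (.+) $]x;
my $location_re = qr[ ([^/]+) / (?:(\S+)\s+)? (\S+) $]x;

# Assumed sample lines with CRLF endings (a guess at the file's state).
my $name_crlf     = "Alanon, Bart\r\n";
my $location_crlf = "3406/165 NS\r\n";

my @name = $name_crlf     =~ $name_re;
my @loc  = $location_crlf =~ $location_re;

# (.+) swallows the CR, because '.' matches anything except a newline.
print "first name ends in CR: ", ( $name[1] =~ /\r\z/ ? "yes" : "no" ), "\n";

# \S+ cannot include the CR, and $ only matches at the very end or just
# before a final newline, so the location pattern fails outright.
print "location captures: ", scalar(@loc), "\n";

# Stripping the line ending first lets the unchanged pattern match.
( my $clean = $location_crlf ) =~ s/\r?\n\z//;
my @fixed = $clean =~ $location_re;
print "after stripping CR/LF: @fixed\n";
```

If the hypothesis holds, removing the CR before matching (or converting the file's line endings) would be enough; the patterns themselves would not need to change.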

      Perhaps you could run this one liner on the data file to dump the first few lines in hex and post the output here?

      perl -nle" exit if $. == 15; print unpack 'H*', $_" testdata.txt
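The one-liner above is quoted for a Windows command prompt; in a Unix shell the double quotes let the shell try to expand $. and $_ before perl sees them. As a sketch that avoids shell quoting altogether, the same dump can be written as a small script. The helper name hex_lines and the in-memory sample data are assumptions for illustration; a real run would open testdata.txt instead.

```perl
use strict;
use warnings;

# Hypothetical helper: hex-encode each of the first $limit lines,
# with the trailing newline removed (as -l does for the one-liner).
sub hex_lines {
    my ( $fh, $limit ) = @_;
    my @out;
    while ( my $line = <$fh> ) {
        last if $. > $limit;
        chomp $line;
        push @out, unpack 'H*', $line;
    }
    return @out;
}

# In-memory stand-in for testdata.txt, assumed here to have CRLF endings.
my $sample = "Alanon, Bart\r\n5590/EL\r\n";
open my $fh, '<', \$sample or die "open failed: $!";

# The one-liner exits on reaching line 15, so it prints lines 1 to 14.
print "$_\n" for hex_lines( $fh, 14 );
close $fh;
```

In the hex output a carriage return shows up as 0d at the end of a line, and a tab as 09, which makes stray whitespace easy to spot.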

        OK, BrowserUk, here are the results.
        %perl -nle" exit if $. == 15; print unpack 'H*', $_" testdata.txt
        Illegal variable name.
        Hope you can make sense of it ...

