PerlMonks  

Re^3: parsing text file

by Utilitarian (Vicar)
on Jun 03, 2011 at 13:13 UTC


in reply to Re^2: parsing text file
in thread parsing text file

First, open the file. While there are lines left in it, split each line into an array, then add that array to your global AoA (array of arrays).
The relevant commands are open, while, split and push.
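A minimal sketch of those four steps, assuming whitespace-separated columns and '#' comment lines. The sample text and the names `$sample`, `$fh` and `@AoA` are made up for illustration; with a real file you would open its path instead of the in-memory string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input. For a real file, use something like:
#   open(my $fh, '<', $path) or die "Cannot open $path: $!";
my $sample = <<'END';
# comment line
1022289744  8008102935  221.00  199.00

1022290146  8008102942  0.00  199.00
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";

my @AoA;    # array of arrays: one inner array per data line
while ( my $line = <$fh> ) {
    chomp $line;
    next if $line =~ /^\s*(?:#|$)/;    # skip comment lines and blank lines
    my @fields = split ' ', $line;     # split on runs of whitespace
    push @AoA, \@fields;               # store a reference to this line's fields
}
close $fh;

printf "%d rows; row 0 has %d fields; first field is %s\n",
    scalar @AoA, scalar @{ $AoA[0] }, $AoA[0][0];
```

Individual cells are then reachable as `$AoA[$row][$col]`, which may be simpler than matching every column with one long regex.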

We don't do your job, though it looks as though MidLifeXis is doing your manager's.

print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."

Re^4: parsing text file
by doubledecker (Scribe) on Jun 06, 2011 at 10:16 UTC

    Sorry that I could not post the whole code

    Here is the thing I've tried out.

    The problem is that I'm unable to match the numbers inside the data. Please suggest the best way to do this.

    #!/usr/bin/perl
    use strict;

    # Array to hold the text file data
    my @chunks;

    open(CONF, $txtFile) || die "cannot find the text file\n";
    while(<CONF>){
        chomp;
        if (/^#/){
            # Must be a comment, skip it
            next;
        } elsif (/^\s*$/) {
            # Only contains whitespace, skip it
            # could be blank lines
            next;
        } elsif (/^M/){
            # Contains dos/mac control characters
            my @lines = split /^M/, $_;
            for ( my $i = 0 ; $i <= $#lines ; $i++ ){
                push(@chunks, $lines[$i]);
            }
        } else {
            # assumed to be a normal data line
            # trim trailing and leading spaces for parsing the data later
            $_ =~ s/^\s+|\s+$//g;
            push(@chunks, $_);
        }
        #print "Found data for ", scalar(@chunks), " lines in $csv\n\n";
    }
    close(CONF);

    # Get the Array Index where the 'Summary of This Bill Period Charges' text is located
    my $index = indexArray('Summary of This Bill Period Charges', @chunks);

    # skip next 4 lines
    $index = $index + 4;

    foreach ( $index .. @chunks ) {
        if ( $data =~m/(\d+)\s{2,}(\d+)\s{2,}((\d|-)?(\d|,)*\.?\d*)\s{2,}((\d|-)?(\d|,)*\.?\d*)\s{2,}/ ) {
            print $5;
        }
    }

    # Thanks to a post which gave me this snippet
    sub indexArray{
        my ($text, @data) = @_;
        for( 1..@data ) {
            ;
            if ( $data[$_] =~ m/$text/ig ) {
                ;
                return $_-1;
            }
        }
        -1
    }
      Okay

      A number of things.

      One: $data has not been defined. Inside that loop the current index is in $_, so I think you want to look at $chunks[$_].

      Two: The indices for @chunks go from 0 to @chunks-1, so your range $index .. @chunks runs one element past the end. So be careful!

      Three: In your regular expression, you have captures within captures. Groups are numbered by the position of their opening parenthesis, so (abc(def)(ghi))(xyz) will capture the following:

      $1 = abcdefghi
      $2 = def
      $3 = ghi
      $4 = xyz

      So, most of the time, your $5 returns undef.

      Four: I am uncertain if you know what it is that you are matching. See the following code and results. It will show what you are matching, and what you could be matching if you used a simpler regex. I don't know if this is what you want, but it should get you started in the right direction.

      Code

      #!/usr/bin/perl
      use strict;
      use warnings;

      my @chunks = <DATA>;

      foreach my $i ( 0 .. @chunks-1 ) {
          my $data = $chunks[$i];
          no warnings;
          if ( $data =~m/(\d+)\s{2,}(\d+) \s{2,}((\d|-)?(\d|,)*\.?\d*)\s{2,}((\d|-)?(\d|,)*\.?\d*)\s{2,}/) {    #/ ) {
              print "<$1> <$2> <$3> <$4> <$5> <$6> <$7> <$8>\n";
          }
      }

      print "\n\n";

      foreach my $i ( 0 .. @chunks-1 ) {
          my $data = $chunks[$i];
          no warnings;
          if ( $data =~m/(\d+)\s+(\d+)\s+(-?\d*,?\d*\.?\d*)\s+(-?\d*,?\d*\.?\d*)\s+/) {    #/ ) {
              print "<$1> <$2> <$3> <$4>\n";
          }
      }

      __DATA__
      1022289744    8008102935    221.00    199.00    70.50    3.20    0.00    -9.70    27.09    290.09
      1022290146    8008102942    0.00    199.00    63.80    0.00    0.00    -3.80    26.70    285.70
      1022290145    8008102930    0.00    199.00    207.80    3.20    1.20    -120.00    30.04    321.24
      1022289844    8008102943    0.00    199.00    5.50    9.00    0.00    0.00    21.98    235.48
      1022290156    8008102954    0.00    199.00    283.40    0.40    11.20    -51.80    45.53    487.73
      1022290048    8008102949    0.00    199.00    0.00    0.00    0.00    0.00    20.50    219.50
      Results
      <1022289744> <8008102935> <221.00> <2> <1> <199.00> <1> <9>
      <1022290146> <8008102942> <0.00> <0> <> <199.00> <1> <9>
      <1022290145> <8008102930> <0.00> <0> <> <199.00> <1> <9>
      <1022289844> <8008102943> <0.00> <0> <> <199.00> <1> <9>
      <1022290156> <8008102954> <0.00> <0> <> <199.00> <1> <9>
      <1022290048> <8008102949> <0.00> <0> <> <199.00> <1> <9>


      <1022289744> <8008102935> <221.00> <199.00>
      <1022290146> <8008102942> <0.00> <199.00>
      <1022290145> <8008102930> <0.00> <199.00>
      <1022289844> <8008102943> <0.00> <199.00>
      <1022290156> <8008102954> <0.00> <199.00>
      <1022290048> <8008102949> <0.00> <199.00>
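To make point Three concrete, here is a small sketch of how nested groups are numbered, and of one possible way to keep the original grouping without the extra captures by using non-capturing (?:...) groups. The names `$row` and `$num` are made up for illustration, and the column spacing in the sample row is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Capture groups are numbered by the position of their opening parenthesis.
my @nested = 'abcdefghixyz' =~ /(abc(def)(ghi))(xyz)/;
print join( ' | ', @nested ), "\n";    # abcdefghi | def | ghi | xyz

# First columns of one row from the sample data (spacing is assumed).
my $row = '1022289744    8008102935    221.00    199.00    70.50';

# (?:...) groups without capturing, so $1..$4 line up with the four columns.
my $num  = qr/-?(?:\d|,)*\.?\d*/;
my @cols = $row =~ /(\d+)\s{2,}(\d+)\s{2,}($num)\s{2,}($num)\s{2,}/;
print join( ' | ', @cols ), "\n";      # 1022289744 | 8008102935 | 221.00 | 199.00
```

With this pattern the fourth column is in $4, where the nested version needed $6.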
      Good luck!
