PerlMonks  

problems with arrays

by Anonymous Monk
on Jun 20, 2002 at 09:33 UTC ( [id://175938]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, my problem is that I have written a program which reads in a file containing 12 columns of data. I then pass it into an array and use 'split' to divide each column into an element. Some lines in the file have a space at the start of the line, which is affecting how split divides the line! How can I remove this single whitespace character from the start of some lines? E.g.:
12 data data data 12 data data data
I have tried the following but it doesn't work!
#! /usr/local/bin/perl -w
use strict;
open (FH, $ARGV[0]) or die "unable to open file";
open (OUTFILE, ">$ARGV[1]");
my $line;
my @array;
while (<FH>) {
    $line = $_;
    # remove the newline characters from variable $line;
    chomp ($line);
    # @array = ();
    # remove a single whitespace from the start of lines
    foreach $line (@array) {
        $line =~ s/\s{1}//g;
    }
    @array = split (/\s+/, $line);
}
help!

Replies are listed 'Best First'.
Re: problems with arrays
by Aristotle (Chancellor) on Jun 20, 2002 at 10:14 UTC
    What are you doing there?
    foreach $line (@array) { $line =~ s/\s{1}//g; }

    You are aliasing $line to the elements of @array; but because you emptied it just before, the loop never executes at all. Even if it did, you would be substituting on the array's elements; but what you want is to do so on the line of input data.
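A minimal sketch of the aliasing behaviour described above, using made-up sample data (`@fields` and `@empty` are hypothetical names, not from the original program). It only illustrates that a foreach loop variable is an alias for each array element, and that the body never runs when the array is empty:

```perl
use strict;
use warnings;

# Hypothetical sample data: two elements with leading spaces.
my @fields = (" a", " b");

for my $item (@fields) {
    $item =~ s/^\s+//;    # $item is an alias, so this edits the array element itself
}
print join('|', @fields), "\n";    # prints a|b

# With an empty array the loop body is never entered.
my @empty;
my $runs = 0;
for my $item (@empty) {
    $runs++;
}
print "$runs\n";    # prints 0
```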

    I'll quote from perldoc -f split:

    As a special case, specifying a PATTERN of space (' ') will split on white space just as split with no arguments does. Thus, split(' ') can be used to emulate awk's default behavior, whereas split(/ /) will give you as many null initial fields as there are leading spaces. A split on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field.

    In other words,
    just @array = split (" ", $line);
    and you should be fine.
    (And it rhymes, too.)
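The difference quoted from perldoc can be checked with a short sketch. The sample line is an assumption modelled on the question (one leading space), and the variable names are illustrative only:

```perl
use strict;
use warnings;

# Hypothetical sample line with one leading space, as in the question.
my $line = " 12 data data data";

my @awk_style = split ' ', $line;      # leading whitespace is skipped
my @regex     = split /\s+/, $line;    # leading whitespace yields an empty first field

print scalar(@awk_style), " fields: [", join('|', @awk_style), "]\n";
print scalar(@regex),     " fields: [", join('|', @regex),     "]\n";
# prints:
# 4 fields: [12|data|data|data]
# 5 fields: [|12|data|data|data]
```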

    I'd like to mention you should rather move the my @array; into the loop - instead of emptying the array every time over. Variables should be declared as close to where they're used as possible, and restricted to as narrow a scope as feasible.

    Taking all this together and using the fact that you can perfectly well work with $_ without assigning it to something first, we get this:
    #!/usr/local/bin/perl -w
    use strict;
    open (FH, $ARGV[0]) or die "unable to open input file: $!"; # more helpful message
    open (OUTFILE, ">$ARGV[1]") or die "unable to open output file: $!"; # this should have one too
    while (<FH>){
        chomp;
        my @array = split;
        print OUTFILE "@array\n"; # or whatever has to be done
    }

    As a general coding style, I would propose you don't try to handle the filenames yourself; let Perl decide about the input arguments and let the user decide, via redirection, where he wants to put the stuff. So we get this:
    #!/usr/local/bin/perl -w
    use strict;
    while (<>){
        chomp;
        my @array = split;
        print "@array\n"; # or whatever has to be done
    }

    A sidenote for the curious: perl -na will build almost exactly that loop framework for you. :)
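As a rough sketch of what those switches set up: per perlrun, -n wraps the code in `while (<>) { ... }` and -a adds `@F = split(' ');` at the top of that loop. The version below is an approximation for illustration only. It reads from an in-memory string instead of `<>` so that it runs without an input file, and it uses a lexical `@F` where the real switch uses the package variable; the sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical input held in memory so the sketch runs without a file.
my $input = " 12 a b c\n34 d e f\n";
open my $fh, '<', \$input or die "cannot open in-memory input: $!";

my @first_columns;
while (<$fh>) {            # -n supplies: while (<>) { ... }
    my @F = split ' ';     # -a supplies: @F = split(' ');
    push @first_columns, $F[0];
}
close $fh;

print "@first_columns\n";  # prints 12 34
```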

    Makeshifts last the longest.

      thanks Aristotle (and the others!!) - you've been a massive help - much appreciated :-)
(wil) Re: problems with arrays
by wil (Priest) on Jun 20, 2002 at 09:50 UTC
    To remove the leading whitespace, you will need a regex. Something like this should do:

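    $line =~ s/^\s+//;
    Which basically tells Perl to strip one or more whitespace characters from the beginning of the line or string (instructed by the ^). Note that there is no my in front of $line here: declaring it again would create a new, empty variable instead of changing the line you just read.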

    Looking over your code quickly, I would use the above code to replace the line $line =~ s/\s{1}//g;

    Hope this helps.

    - wil
Re: problems with arrays
by schumi (Hermit) on Jun 20, 2002 at 09:52 UTC
    Hi.

    From your code I assume that your columns have more than one space between them, as you say split (/\s+/, $line) - how sure are you of that?

    Anyway, I'd do something along the lines of this:

    # remove a single whitespace from the start of lines
    foreach $line (@array) {
        $line =~ s/^\s//g;
    }

    The ^ matches the beginning of a string or line - else you'd strip out occurrences of whitespace where you'll need them for your split. This should do the trick, I think.
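To make the effect of the ^ anchor concrete, here is a small sketch comparing the unanchored substitution from the question with two anchored variants. The sample string deliberately has two leading spaces (an assumption, so that the anchored variants differ), and the variable names are illustrative:

```perl
use strict;
use warnings;

# Hypothetical line with two leading spaces.
my $original = "  12 data data data";

(my $unanchored  = $original) =~ s/\s{1}//g;   # removes every whitespace character
(my $one_leading = $original) =~ s/^\s//;      # removes at most one leading whitespace character
(my $all_leading = $original) =~ s/^\s+//;     # removes the whole leading run

print "[$unanchored]\n";     # prints [12datadatadata]
print "[$one_leading]\n";    # prints [ 12 data data data]
print "[$all_leading]\n";    # prints [12 data data data]
```

The unanchored version leaves nothing for split to separate on, which is why the anchored forms are the ones to use before splitting.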

    Update: Right after having hit stumbit, I realised that wil was quicker than me - wil++.

    --cs

    There are nights when the wolves are silent and only the moon howls. - George Carlin

Re: problems with arrays
by bronto (Priest) on Jun 20, 2002 at 12:26 UTC

    Well, there are some possible points of confusion. Your snippet of code doesn't show what you do with @array after the while loop; that suggests to me that you could be reading the file, modifying each line and then throwing away the modified line. Anyway, I'll try to work on the snippet.

    On the first iteration, @array contains nothing (since you commented out the assignment @array = () ;), so the subsequent foreach won't run, and if $line contains a leading whitespace, it won't catch it.

    Your code could be patched by using the peculiarities of split when working on $_, like this

    while (<FH>) {
        chomp ;
        @array = split ;
        # do something with @array
    }
    But if preserving the value of $_ is vital for you, you could change it slightly this way:
    while (<FH>) {
        $line = $_ ;
        chomp $line ;
        $line =~ s/^\s// ; # delete leading whitespace, if any
        @array = split /\s+/,$line ;
        # do something with @array; $_ is preserved so you
        # can use it again
    }

    That's all I could do with the information you sent ;-). If these solutions don't fit, just ask specifying better.

    --bronto

    PS: ...and if you need more information on the behaviour of split with no arguments, there is no better source than perldoc -f split;-)

Re: problems with arrays
by husker (Chaplain) on Jun 20, 2002 at 14:47 UTC
    I asked the same question long ago here.

    The shortest answer seemed to be:

    @array = split;
    which tells split to ignore leading whitespace.

Node Type: perlquestion [id://175938]
Approved by schumi