PerlMonks  

One liner: split fixed-width field into equal length chunks...

by mr. jaggers (Sexton)
on Mar 21, 2003 at 02:48 UTC ( [id://244788]=perlquestion )

mr. jaggers has asked for the wisdom of the Perl Monks concerning the following question:

Anyone know how to do this? Note, there is no explicit delimiter; I just want k-length chunks from an n-length string (right now, n is a constant defined in the script's config module).

Anything more elegant than something like...
for(my($i)=0;$i<$length-2;$i+=$n) { eval{"\$d".($i-1/$n)} = substr($field, $i, $i+2) }

...would be greatly appreciated. (note: untested... well, probably not well thought out either, but the idea stands)

Well, actually, anything shorter and more elegant (I've been told that eval is never elegant ;).

BTW, I even know how wide the field is, so I could put a my($d1, $d2, $d3, $d4) = in front of the splitting part. The general goal is to handle a fixed-width field of arbitrary length, but if I could have the above, I could get a task done and mark that section for later revision.

Replies are listed 'Best First'.
Re: One liner: split fixed-width field into equal length chunks...
by BrowserUk (Patriarch) on Mar 21, 2003 at 03:40 UTC

    Whenever you find yourself numbering individual variables instead of giving them names, you should recognise that what you ought to be using is an array.

    It does away with the need for eval and it is automatically extensible.

    A second thing I notice in your code is that your use of substr is wrong. The 3rd parameter to substr is the number of bytes, not the end position of the substring. As you have shown it, your variables will get larger and larger in length:

    $d1 = substr($field, 0, 2);    # 2 bytes
    $d2 = substr($field, 2, 2+2);  # 4 bytes
    $d3 = substr($field, 4, 4+2);  # 6 bytes
    etc.

    You also show $i+=$n in the loop increment, $i-1/$n to generate your varnames, but a fixed value of 2 in the substr?
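A minimal sketch of the substr point, assuming a made-up ten-character sample string. The third argument is a length, so passing an end position makes each chunk longer than the one before:

```perl
use strict;
use warnings;

my $field = 'aabbccddee';   # sample data, assumed for illustration

# Third argument used correctly as a length: every chunk is 2 characters.
my @fixed = map { substr($field, $_, 2) } 0, 2, 4;

# Third argument mistakenly used as an end position: the chunks grow.
my @growing = map { substr($field, $_, $_ + 2) } 0, 2, 4;

print "@fixed\n";     # aa bb cc
print "@growing\n";   # aa bbcc ccddee
```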

    There are several ways to do what you want without using eval or symbolic refs.

    my @d; for my $i (0 .. ($length-1)/$n) { $d[$i] = substr($field, $i*$n, $n) }

    Or

    my @d = map{ substr $field, $_*$n, $n } 0 .. ($length-1)/$n;

    Or

    my @d = unpack("A$n " x (($length+$n-1)/$n), $field);

    Or probably the simplest

    my @d = $field =~ m[.{1,$n}]g;
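A minimal sketch, assuming the sample string and chunk width below (both invented for illustration), that runs the four approaches side by side so they can be compared. The helper variable $last is not from the post; it holds the index of the final chunk so that no empty trailing chunk is produced. The unpack variant here uses "a" rather than "A" so that trailing spaces would be kept.

```perl
use strict;
use warnings;

# Sample data and chunk width are assumptions for illustration only.
my $field  = 'abcdefghijklmnopqrstuvwxyz';
my $n      = 4;
my $length = length $field;
my $last   = int( ($length - 1) / $n );   # index of the final chunk

# 1. Explicit loop over chunk indices.
my @by_loop;
for my $i (0 .. $last) {
    $by_loop[$i] = substr($field, $i * $n, $n);
}

# 2. The same thing with map.
my @by_map = map { substr($field, $_ * $n, $n) } 0 .. $last;

# 3. unpack with a repeated template ("a" keeps trailing spaces, "A" strips them).
my @by_unpack = unpack("a$n" x ($last + 1), $field);

# 4. Global regex match in list context.
my @by_regex = $field =~ m[.{1,$n}]g;

# Each line printed should read: abcd efgh ijkl mnop qrst uvwx yz
print "@$_\n" for \@by_loop, \@by_map, \@by_unpack, \@by_regex;
```

Note that the regex version treats "." as "any character except newline", so a field containing newlines would need the /s modifier.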

    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
      Whenever you find yourself numbering individual variables instead of giving them names, you should recognise that what you ought to be using is an array.
      Obligatory reading for more of such fundamental advice: Mark-Jason Dominus' excellent Program Repair Shop and Red Flags article series on Perl.com.

      Makeshifts last the longest.

        Yes, I realize this. I guess I've been programming in languages where dynamic arraying is impractical for so long that I forget that perl has such flexibility.

        I didn't want to declare an array of a fixed size... and if I did, I didn't want to use an explicit for-loop (not sure why not, but the little voice in the back of my head said "hey, this is perl... you probably don't need a for loop to do that... it can probably happen in one line... it's probably a really short line, you idiot!")

        Thanks for the resources, I'll read up on them! (couldn't wait, read the first... wow... so many of my programs to unbreak)
      Or probably the simplest

      my @d = $field =~ m[.{1,$n}]g;

      OH... it is beautiful! I just plugged it into "filename -> hash" function to convert filenames such as "access_log.03020700.gz" into hashes that are something like:
      { type => "access_log", year => "03", month => "02", day => "07", hour => "00", compressed => "1" }
      ... needless to say, logfile munging of usage statistics is going splendidly now!
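A rough sketch of what such a "filename -> hash" function might look like. The function name, the way the filename is split on dots, and the rule for the compressed flag are all assumptions made for illustration; only the example filename and the hash keys come from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: assumes filenames of the form type.YYMMDDHH[.gz].
sub filename_to_hash {
    my ($filename) = @_;
    my ($type, $stamp, $ext) = split /\./, $filename;

    # Split the 8-character timestamp into 2-character chunks.
    my ($year, $month, $day, $hour) = $stamp =~ m[.{1,2}]g;

    return {
        type       => $type,
        year       => $year,
        month      => $month,
        day        => $day,
        hour       => $hour,
        compressed => ( defined $ext && $ext eq 'gz' ? 1 : 0 ),
    };
}

my $info = filename_to_hash('access_log.03020700.gz');
print "$_ => $info->{$_}\n" for sort keys %$info;
```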

      I feel like I've graduated from 1st grade of perl primary school... I'm actually Extracting Practical Reports from data!
Re: One liner: split fixed-width field into equal length chunks...
by Zaxo (Archbishop) on Mar 21, 2003 at 07:32 UTC

    Just for oddity. One-linedness is a cheat and needs 5.8 to do the string-as-file magic.

    {local $/ = \$n; open my $vfh, '<', \$field; @foo = <$vfh>;}
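The same trick spelled out as a small script, assuming a sample string and chunk width chosen for illustration. Setting $/ to a reference to an integer switches readline to fixed-size records, and opening a reference to a scalar yields an in-memory filehandle.

```perl
use strict;
use warnings;

# Sample data and chunk width are assumptions for illustration only.
my $field = 'abcdefghijklmnopqrstuvwxyz';
my $n     = 5;

my @chunks;
{
    # A reference to an integer in $/ makes readline return fixed-size records.
    local $/ = \$n;

    # Opening a reference to a scalar gives an in-memory filehandle (perl 5.8+).
    open my $vfh, '<', \$field or die "Cannot open in-memory handle: $!";
    @chunks = <$vfh>;
    close $vfh;
}

print "@chunks\n";   # abcde fghij klmno pqrst uvwxy z
```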

    After Compline,
    Zaxo

Re: One liner: split fixed-width field into equal length chunks...
by pfaut (Priest) on Mar 21, 2003 at 03:09 UTC
    #!/usr/bin/perl -w
    use strict;

    sub fixedsplit {
        my ($data, $len) = @_;
        ($data =~ /(.{1,$len})/g);
    }

    my $data = "abcdefghijklmnopqrstuvwxyz";
    my @fields;
    @fields = fixedsplit($data, 4);
    print "@fields\n";
    @fields = fixedsplit($data, 6);
    print "@fields\n";

    Produces:

    abcd efgh ijkl mnop qrst uvwx yz
    abcdef ghijkl mnopqr stuvwx yz
    --- print map { my ($m)=1<<hex($_)&11?' ':''; $m.=substr('AHJPacehklnorstu',hex($_),1) } split //,'2fde0abe76c36c914586c';
      Yes, this is clearer, and is actually pretty close to what I settled on (before the latter Monks replied). Thanks, pfaut!
Re: One liner: split fixed-width field into equal length chunks...
by robartes (Priest) on Mar 21, 2003 at 06:42 UTC
    This answer disqualifies itself as a one-liner from the start. If you're looking for a good one-liner, BrowserUk's last answer is a beauty. However, in the spirit of TIMTOWTDI and 'getting to know your unpack', my curiosity was piqued enough to try using unpack to split a string:
    use strict;
    use Data::Dumper;

    my $string = "abcdefghijklmnopqrstuvwxyz";
    my $aryref = [];
    chunk_em($string, $aryref, 5);
    print Dumper($aryref);

    sub chunk_em {
        my $string = shift;
        my $aryref = shift;
        my $num    = shift;
        # Peel off the first $num characters; the rest goes into $second.
        my ($first, $second) = unpack("A${num}A*", $string);
        push @$aryref, $first;
        # Recurse with the same chunk size while anything is left.
        chunk_em($second, $aryref, $num) if length $second;
        return $aryref;
    }
    __END__
    $VAR1 = [
              'abcde',
              'fghij',
              'klmno',
              'pqrst',
              'uvwxy',
              'z'
            ];
    Not a one liner by any stretch of imagination, so this is FYI only.

    CU
    Robartes-

(dkubb) Re: (1) One liner: split fixed-width field into equal length chunks...
by dkubb (Deacon) on Mar 21, 2003 at 22:51 UTC

    If you're using perl 5.8 you can use the new sub-template function in unpack:

    my @d = unpack "(A$n)*", $string;
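One detail worth checking before relying on this: the "A" format strips trailing spaces from each chunk, while "a" returns the data untouched. A small sketch, assuming a space-padded sample field invented for illustration:

```perl
use strict;
use warnings;

# Sample data is an assumption: three 4-character fields padded with spaces.
my $string = 'ab  cd  e   ';
my $n      = 4;

# "A" strips trailing spaces from each chunk...
my @stripped = unpack "(A$n)*", $string;

# ...while "a" returns each chunk exactly as stored.
my @raw = unpack "(a$n)*", $string;

print join('|', @stripped), "\n";   # ab|cd|e
print join('|', @raw), "\n";        # ab  |cd  |e   (padding kept in each chunk)
```

For space-padded fixed-width records "A" is often the convenient choice; if the padding is significant, "a" is the safer one.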

    Dan Kubb, Perl Programmer

Re: One liner: split fixed-width field into equal length chunks...
by OM_Zen (Scribe) on Mar 21, 2003 at 18:09 UTC
    Hi,

    This will also do what you require:

    my $a = "HereIsTheStringAsString";
    my $n = 5;
    while ($a =~ /.{1,$n}/) { print "$&\n"; $a = $' };
    __END__
    HereI
    sTheS
    tring
    AsStr
    ing
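A variant sketch of the same loop, using a /g match in scalar context with a capture group. This leaves the original string intact and avoids $& and $', which perlvar warns can slow down regex matching in older perls. The variable names are chosen here for illustration; the sample string is the one from the post.

```perl
use strict;
use warnings;

my $text = 'HereIsTheStringAsString';   # same sample string as in the post
my $n    = 5;

# Each iteration resumes matching where the previous match ended.
my @chunks;
while ( $text =~ /(.{1,$n})/g ) {
    push @chunks, $1;
}

print "$_\n" for @chunks;
```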
