http://qs321.pair.com?node_id=634273

uvnew has asked for the wisdom of the Perl Monks concerning the following question:

Hello, my question is probably extremely trivial, but I couldn't find it in the guidebook, so I would be grateful for your help. I have a very long string of more than 100,000 characters (let's say 100,400 characters). I want to loop through that string, each time saving only a chunk of 1000 characters. So, for example, on the first iteration $temp will get characters 1-1000, on the second iteration $temp will get 1001-2000, and so on; on the last iteration $temp will get 100,001-100,400. Thank you very much for any suggestion.

Replies are listed 'Best First'.
Re: Splitting a long string
by GrandFather (Saint) on Aug 22, 2007 at 08:51 UTC

    For a modest size string like that:

    for my $temp ($str =~ /.{1,1000}/g) {
        # do stuff with $temp
    }

    is probably sufficiently fast.

    Update: s/1000/1,1000/ per almut's catch.


    DWIM is Perl's answer to Gödel

      Minor nitpick: to get the last block of 400 chars as well, that regex should be /.{1,1000}/g, i.e. min 1, max 1000.

      Yep, that's perfect. Thank you all!
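The difference between the two quantifiers in this sub-thread can be checked with a minimal sketch. The 2,400-character test string and the variable names are assumptions made for illustration, not part of the original posts. The /s modifier is also an addition: without it, `.` does not match a newline, so a string containing newlines would be cut at each one.

```perl
use strict;
use warnings;

# Hypothetical test data: 2,400 characters, so the last chunk is short.
my $str = 'x' x 2_400;

# In list context, a /g match without capture groups returns every match.
my @exact   = $str =~ /.{1000}/gs;     # full 1000-character chunks only
my @partial = $str =~ /.{1,1000}/gs;   # also keeps the 400-character tail

printf "exact: %d chunks, partial: %d chunks, tail: %d chars\n",
    scalar @exact, scalar @partial, length $partial[-1];
```

With /.{1000}/ the trailing 400 characters are silently dropped, which is the problem almut's correction addresses.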
Re: Splitting a long string
by dug (Chaplain) on Aug 22, 2007 at 10:51 UTC

    Another way to do this is to treat your data as an "in memory" file (see perldoc -f open). Setting the input record separator variable $/ ($INPUT_RECORD_SEPARATOR) to a reference to an integer makes the readline operator read that many bytes, or as many as it can until EOF. Putting that all together gives us:

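    #!/usr/bin/perl
    use warnings;
    use strict;

    my $string = "x" x 100_400;

    {
        # open the string itself as a read-only in-memory file
        open ( my $sth, "<", \$string ) or die $!;

        # read fixed-size records of 1000 bytes instead of lines
        local $/ = \1_000;
        while ( my $chunk = <$sth> ) {
            print $chunk, "\n";
        }
    }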
    -- Douglas Hunter
Re: Splitting a long string
by holli (Abbot) on Aug 22, 2007 at 11:38 UTC
    There are basically two ways. One is destructive to the original string, the other is not but means more code:
    my $long_string = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    my $i = 0;
    my $readlength = 8;

    # non destructive
    # ( "<" rather than "<=" avoids an empty final chunk when the
    #   length is an exact multiple of $readlength )
    while ( $i < length($long_string) ) {
        my $part = substr( $long_string, $i, $readlength );
        $i += $readlength;
        print "$part\n";
    }

    # destructive: each pass removes the leading chunk from $long_string
    while ( $long_string =~ s/^(.{1,$readlength})// ) {
        my $part = $1;
        print "$part\n";
    }


    holli, /regexed monk/
Re: Splitting a long string
by akho (Hermit) on Aug 22, 2007 at 08:53 UTC
    I have not actually understood your goal but
    $temp = substr($line, $i * 1_000, 1_000);
    will give you the substrings you seem to want.
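The substr call above needs a loop around it to produce every chunk. A minimal sketch, assuming a 100,400-character stand-in for $line; the names $chunk_size, $n_chunks and @lengths are hypothetical and only serve the illustration:

```perl
use strict;
use warnings;

my $line       = 'x' x 100_400;   # stand-in for the long string
my $chunk_size = 1_000;

# Number of chunks, rounding up so the short tail is included.
my $n_chunks = int( ( length($line) + $chunk_size - 1 ) / $chunk_size );

my @lengths;
for my $i ( 0 .. $n_chunks - 1 ) {
    # substr returns fewer characters when the string runs out
    my $temp = substr( $line, $i * $chunk_size, $chunk_size );
    push @lengths, length $temp;
}

printf "%d chunks, last one %d chars\n", scalar @lengths, $lengths[-1];
```

Because substr simply returns what is left when fewer than $chunk_size characters remain, the final 400-character chunk needs no special handling.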