PerlMonks  

splitting file with multiple output filenames

by preahkumpii (Novice)
on Apr 11, 2012 at 09:11 UTC (#964480=perlquestion)

preahkumpii has asked for the wisdom of the Perl Monks concerning the following question:

I have a file in which each line needs to be written to its own file, with the filename taken from the first few characters of that line. Here is what I have so far:
    #!/usr/bin/perl
    # chess.plx
    use warnings;
    use strict;
    use utf8;

    open THAI, "thaikjv-fixed.txt" or die $!;
    binmode(THAI, ":utf8");
    foreach(<>) {
        if (/^@(...\d\d\d)/) {
            my $r = "$1.html";
            open O, ">$r" or die $!;
            binmode(O, ":utf8");
            print O "$_";
            print "hi";
            close (O);
        }
    }
Thanks for any help.

Replies are listed 'Best First'.
Re: splitting file with multiple output filenames
by MidLifeXis (Monsignor) on Apr 11, 2012 at 12:37 UTC

    What happens if you get the same value in two different matches?

    open O, ">$r" or die $!;
    You may want to open the file in append mode: open O, '>>', $r or die $!;

    --MidLifeXis
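
The point about repeated matches can be checked with a small experiment. The following is only a sketch under stated assumptions: the key GEN001, the two input lines, and the helpers split_lines and read_lines are made up for illustration and are not part of the original script. It writes the same two lines once with '>' and once with '>>' into temporary directories and counts what survives.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write each "@KEY text" line to "$dir/KEY.html",
# opening the output file with the given mode ('>' or '>>').
sub split_lines {
    my ($dir, $mode, @lines) = @_;
    for my $line (@lines) {
        next if $line !~ /^@(...\d\d\d)/;
        my $outname = "$dir/$1.html";
        open my $out, $mode, $outname or die "Can't open $outname: $!\n";
        print {$out} $line;
        close $out or die "Can't close $outname: $!\n";
    }
    return;
}

# Hypothetical helper: return the lines currently stored in a file.
sub read_lines {
    my ($path) = @_;
    open my $in, '<', $path or die "Can't open $path: $!\n";
    my @lines = <$in>;
    close $in;
    return @lines;
}

# Two made-up input lines that share the key GEN001.
my @input = ("\@GEN001 first\n", "\@GEN001 second\n");

# '>' truncates the file on every open, so only the last line survives.
my $dir_w = tempdir(CLEANUP => 1);
split_lines($dir_w, '>', @input);
my @kept_w = read_lines("$dir_w/GEN001.html");

# '>>' appends, so both lines end up in the file.
my $dir_a = tempdir(CLEANUP => 1);
split_lines($dir_a, '>>', @input);
my @kept_a = read_lines("$dir_a/GEN001.html");

print "write mode kept ", scalar(@kept_w), " line(s)\n";
print "append mode kept ", scalar(@kept_a), " line(s)\n";
```

One caveat with append mode: output files left over from an earlier run are appended to as well, so they may need to be removed before rerunning.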

Re: splitting file with multiple output filenames
by preahkumpii (Novice) on Apr 11, 2012 at 09:17 UTC
    OK, that was a serious and embarrassing 'duh' moment. I forgot to put the input filehandle name after foreach, so it should be: foreach (<THAI>) { Sorry for the waste of time.

      OK, you solved the problem, but let's make the post worthwhile by passing on the usual round of "wisdom". Consider:

          #!/usr/bin/perl
          # chess.plx
          use warnings;
          use strict;
          use utf8;

          my $filename = "thaikjv-fixed.txt";
          open my $fIn, '<', $filename or die "Can't open $filename: $!\n";
          binmode ($fIn, ":utf8");

          while (defined (my $line = <$fIn>)) {
              next if $line !~ /^@(...\d\d\d)/;

              my $outname = "$1.html";
              open my $fOut, '>', $outname or die "Can't create $outname: $!\n";
              binmode ($fOut, ":utf8");
              print $fOut $line;
              print "hi";
              close $fOut;
          }

      Note:

      • use three parameter version of open and lexical file handles
      • show the file name in errors
      • use a while loop instead of a for loop when reading files
      • use early exit to avoid extra levels of indentation
      • avoid using the default variable across multiple lines
      True laziness is hard work

      I'd suggest using while (<FH>) instead of foreach (<FH>). A while loop reads the file one line at a time, whereas a for(each) loop slurps the entire file into memory first. That is really bad for huge files, but OK for small ones.

      Maybe something like this:
          use strict;
          use warnings;

          open my $thai_fh, '<:utf8', "thaikjv-fixed.txt" or die $!;
          while(<$thai_fh>){
              if (/^@(...\d\d\d)/) {
                  open my $out_fh, '>:utf8', "$1.html" or die $!;
                  print {$out_fh} $_;
              }
          }
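
The difference between the two loop styles can be observed directly. This is a minimal sketch, assuming a made-up three-line temporary file; the variable names are illustrative only. It asks eof() during the first iteration of each loop: with foreach the handle is already exhausted because <$fh> was evaluated in list context, while with while only one line has been read.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small made-up three-line input file to read back.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 3;
close $tmp_fh or die "Can't close $tmp_name: $!\n";

# foreach evaluates <$fh> in list context, so every line has been read
# before the loop body runs for the first time.
open my $fh, '<', $tmp_name or die "Can't open $tmp_name: $!\n";
my $foreach_at_eof;
foreach my $line (<$fh>) {
    $foreach_at_eof = eof($fh) ? 1 : 0;
    last;
}
close $fh;

# while evaluates <$fh> in scalar context, reading one line per iteration.
open $fh, '<', $tmp_name or die "Can't open $tmp_name: $!\n";
my $while_at_eof;
while (defined(my $line = <$fh>)) {
    $while_at_eof = eof($fh) ? 1 : 0;
    last;
}
close $fh;

print "foreach: at end of file during first iteration? $foreach_at_eof\n";
print "while:   at end of file during first iteration? $while_at_eof\n";
```

For a file of a few thousand short lines either loop is fine; the while form mainly matters once the input no longer fits comfortably in memory.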
Re: splitting file with multiple output filenames
by snape (Pilgrim) on Apr 11, 2012 at 10:05 UTC

    This should work for you

        #!/usr/bin/perl
        use warnings;
        use strict;
        use utf8;

        my $filename = "thaikjv-fixed.txt";
        open my $THAI, '<', $filename or die "$!\n";
        binmode ($THAI, ":utf8");

        while (<$THAI>) {
            chomp $_;
            if ($_ =~ m/^@(...\d\d\d)/) {
                my $r = "$1.html";
                open my $OUT, '>', $r or die "$!\n";
                binmode ($OUT, ":utf8");
                # The current line is in $_; put back the newline that chomp removed.
                print $OUT "$_\n";
                close($OUT);
            }
        }
        close($THAI);

Node Type: perlquestion [id://964480]
Approved by GrandFather