
Re: Splitting a large file into smaller files according to indexes

by bliako (Prior)
on Apr 05, 2018 at 20:03 UTC (#1212374)


in reply to Splitting a large file into smaller files according to indexes

I have commented out some lines in your program and added some others, as follows:
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper ;

    my $source = shift ;
    my $lines_per_file = shift ;

    open (my $FH, "<$source") or die "Could not open source file. $!";
    # open (my $OUT, '>', '00000000.log') or die "Could not open destination file. $!";
    my $OUT = undef;
    my $i = 0;
    #my $index_last = 0 ;
    my $index_current = -1;
    while(my $line = <$FH>) {
        next unless ($line =~ /mrule/) ;
        if ($line =~ /mrule=([0-9]+)/){
            if( $index_current != $1 ){
                $index_current = $1;
                if( defined($OUT) ){ close($OUT); }
                my $NEW = sprintf("%08d", $index_current);
                open($OUT, ">${NEW}.log") or die "Could not open destination file. $! " ;
            }
            print $OUT $line;
            $i++ ;
            # if ($1 != $index_last){
            #     $index_current = $1 ;
            #     close($OUT);
            #     my $NEW = sprintf("%08d", $index_current);
            #     open($OUT, ">${NEW}.log") or die "Could not open destination file. $! " ;
            # }
            # $index_last = $index_current ;
        }
    }
    close($FH);
    #close($OUT);
    if( defined($OUT) ){ close($OUT); }

The above looks for an mrule=[0-9]+ pattern in each line of the input. When it finds one, it checks whether the index in that line is the same as the current index. If it is not, it closes the current filehandle and opens a new one named after the new index. Either way, it then prints the line to the filehandle that is currently open.

Note that no output filehandle is opened until the pattern first appears in the input.

Tested only with the minimal file you provided.

bliako

Re^2: Splitting a large file into smaller files according to indexes
by cryptoperl (Novice) on Apr 05, 2018 at 20:43 UTC

    Works like a charm, thank you. How do I upvote this answer? Sorry, I am a newbie to PerlMonks :)
    Also, why do we initialize $index_current as -1?

        The tips say that I should see a radio button on the replies, but I cannot see any. Probably I don't have voting rights yet.

      It is initialised to -1 (or whatever other value your mrule number will never take) in order to force the opening of a file upon seeing the mrule pattern for the first time.
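
If you would rather not pick a sentinel number at all, one possible variation is to leave the current index undefined and test for that. This is only a sketch and not part of the program above: the helper name index_changed and the sample index values are hypothetical, and the list of file names stands in for the actual open() calls.

```perl
use strict;
use warnings;

# Hypothetical helper: true when no index has been seen yet
# (previous is undef) or when the new index differs from it.
sub index_changed {
    my ($previous, $new) = @_;
    return 1 unless defined $previous;
    return $previous != $new;
}

my $index_current;    # undef means "no output file opened yet"
my @opened;           # names of the files the real program would open

# Sample index values, made up for illustration.
for my $index (140, 140, 141, 141, 140) {
    if (index_changed($index_current, $index)) {
        $index_current = $index;
        push @opened, sprintf("%08d.log", $index_current);
    }
}

print "$_\n" for @opened;
```

With these sample values, 00000140.log shows up twice in the list, because the index goes back to 140 after 141.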

      Glad it worked (it's Perl after all) but I stress that it is untested by me for more complex cases.

      bliako

        As an example of where the above program may go wrong, consider what happens if mrule=140 appears in two different parts of your log file, NOT consecutively, i.e. with a line for another mrule in between. The file for 140 is reopened with ">" the second time, which truncates it, so the second batch of lines will overwrite the first!

        bliako
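
One way around the overwrite problem described above is to keep one open filehandle per mrule value, so that lines for an index that comes back later go to the same file. The following is a minimal sketch under that assumption, not a tested replacement for the program above. The function name split_by_mrule and the output-directory argument are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line of $source that carries an
# mrule=NNN tag into $out_dir/NNNNNNNN.log, keeping one filehandle
# open per distinct mrule value for the whole run.
sub split_by_mrule {
    my ($source, $out_dir) = @_;
    open(my $in, '<', $source) or die "Could not open $source: $!";

    my %out_for;    # mrule value => open output filehandle
    while (my $line = <$in>) {
        next unless $line =~ /mrule=([0-9]+)/;
        my $index = $1;
        unless ($out_for{$index}) {
            # First time this index is seen: create its file once.
            my $name = sprintf("%s/%08d.log", $out_dir, $index);
            open($out_for{$index}, '>', $name)
                or die "Could not open $name: $!";
        }
        print { $out_for{$index} } $line;
    }
    close($in);

    for my $fh (values %out_for) {
        close($fh) or die "Could not close output file: $!";
    }
    my @indexes = sort { $a <=> $b } keys %out_for;
    return @indexes;    # the distinct mrule values that were written
}

# Usage: perl script.pl input.log output_dir
split_by_mrule(@ARGV) if @ARGV == 2;
```

The trade-off is that every distinct index holds a filehandle until the end, so a log with a very large number of distinct mrule values could hit the operating system's limit on open files. In that case, reopening the file in append mode (">>") whenever the index changes is an alternative, provided stale output files from earlier runs are removed first.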
