PerlMonks  

File splitting script

by Alien (Monk)
on Apr 21, 2007 at 13:12 UTC
Category: Utility Scripts
Description: A simple script that can be used to split files.
For the chunk size parameter, you can specify megabytes (M) or gigabytes (G); a plain number is taken as a size in bytes. Here is an example of how the script can be run:

perl split.pl linux.iso 100M linux_chunks

where linux.iso is a CD image we're splitting into 100MB chunks, and we're storing them in the directory linux_chunks. Hope this helps!
use strict;
use warnings;

my $f=shift || die "File that you want to split\n";
my $chunk_size=shift || die "Chunk size\n";
my $dir=shift || die "Directory where the chunks are stored\n";
my $fsize=-s $f;
my $size=0;
my $buffer;
my $BUFFER_SIZE=4096;
my $nr_of_chunks=0;
my $nr=0;

if($chunk_size=~/(\d+)([mMgG])/) {
    my $numb=$1;
    my $tp=$2;
    SW: {
        ($tp eq "g" || $tp eq "G") && do {
            $numb*=1024*1024*1024;
            $size=$numb;
            last SW;
        };
        ($tp eq "m" || $tp eq "M") && do {
            $numb*=1024*1024;
            $size=$numb;
            last SW;
        };
    }
}
else {
    $size=$chunk_size;
}

open(F,'<',$f) || die "OPEN : $!\n";
binmode(F);
if(! -e $dir) {
    mkdir($dir) || die "DIR : $!\n";
}
chdir($dir) || die "CHDIR : $!\n";

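FILE:{
if(($fsize - $size)<=0) {
    # Last chunk: copy whatever is left of the input.
    my $count=0;
    print "Creating file $nr\n";
    open(C,">$nr") || die "CREATE : $!\n";
    binmode(C);
    while($count < $fsize) {
        # Ask only for what this chunk still needs, so $count can never
        # step past the target, even after a short read.
        my $left=$fsize - $count;
        my $got=read(F,$buffer,$left < $BUFFER_SIZE ? $left : $BUFFER_SIZE);
        defined($got) || die "READ : $!\n";
        last if $got == 0;    # end of input reached early
        print C $buffer;
        $count+=$got;
    }
    close C;
    last FILE;
}
else {
    # Full chunk: copy exactly $size bytes, then move on to the next chunk.
    my $count=0;
    print "Creating file $nr\n";
    open(C,">$nr") || die "CREATE : $!\n";
    binmode(C);
    while($count < $size) {
        my $left=$size - $count;
        my $got=read(F,$buffer,$left < $BUFFER_SIZE ? $left : $BUFFER_SIZE);
        defined($got) || die "READ : $!\n";
        last if $got == 0;    # end of input reached early
        print C $buffer;
        $count+=$got;
    }
    close C;
    $fsize-=$size;
    $nr++;
    redo FILE;
}
}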

print "Over & Out\n";
And here is the script needed to join the files:
use strict;
use warnings;

my $dir=shift || die "Directory where the chunks are stored\n";
my $out=shift || die "Name under which we join the chunks\n";
my @files;
my $buffer;
opendir(D,$dir) || die "OPENDIR : $!\n";
@files=grep { /^\d+$/ } readdir(D);
closedir(D);
chdir($dir) || die "CHDIR : $!\n";
open(F,">../$out") || die "CREATE : $!\n";
binmode(F);
@files=sort { $a <=> $b } @files;
for(@files) {
    open(C,$_) || die "OPEN : $!\n";
    binmode(C);
    while(read(C,$buffer,4096)) {
        print F $buffer;
    }
    print "I joined chunk nr. $_\n";
    close C;
}
print "File $out is complete now\n";
close F;
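
The join script depends on sorting the chunk names numerically. A small sketch of why that matters, using a made-up directory listing (chunk_order is a hypothetical helper name, not part of the posted script): with the default string sort, chunk 10 would be joined before chunk 2.

```perl
use strict;
use warnings;

# Hypothetical helper: order chunk names the way the join script does.
sub chunk_order {
    my @names = @_;
    # Keep only all-digit names, then compare them as numbers.
    return sort { $a <=> $b } grep { /^\d+$/ } @names;
}

# Made-up listing, similar to what readdir might return.
my @listing = ('10', '2', '.', '..', '0', '1', 'notes.txt');

print join(' ', chunk_order(@listing)), "\n";           # prints "0 1 2 10"
print join(' ', sort grep { /^\d+$/ } @listing), "\n";  # prints "0 1 10 2"
```

The second line shows the order a plain sort would give, which would corrupt any file split into more than ten chunks.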
Replies are listed 'Best First'.
Re: File splitting script
by jdporter (Paladin) on Apr 21, 2007 at 17:29 UTC

    Nice.

    This seems to be the same idea as the UNIX tool split. Which, by the way, has a pure Perl implementation in the Perl Power Tools project.

    Um... This seems rather awkward:

    if($chunk_size=~/(\d+)([mMgG])/) {
        my $numb=$1;
        my $tp=$2;
        SW: {
            ($tp eq "g" || $tp eq "G") && do {
                $numb*=1024*1024*1024;
                $size=$numb;
                last SW;
            };
            ($tp eq "m" || $tp eq "M") && do {
                $numb*=1024*1024;
                $size=$numb;
                last SW;
            };
        }
    }
    else {
        $size=$chunk_size;
    }

    I think the following would be a bit more natural:

    if ( $chunk_size =~ /(\d+)g/i ) {
        $size = $1 * 1024*1024*1024;
    }
    elsif ( $chunk_size =~ /(\d+)m/i ) {
        $size = $1 * 1024*1024;
    }
    else {
        $size = $chunk_size;
    }

    Of course, it could be further succinctified:

    $size =
        $chunk_size =~ /(\d+)g/i ? $1 * 1024*1024*1024 :
        $chunk_size =~ /(\d+)m/i ? $1 * 1024*1024 :
        $chunk_size =~ /(\d+)k/i ? $1 * 1024 : # easy addition
        $chunk_size;
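
Another way to write the same suffix handling is with a lookup table. This is only a sketch: parse_size and %MULT are hypothetical names, and unlike the versions above it anchors the pattern, so an argument such as "100MB" or "abc" is rejected (undef) rather than passed through.

```perl
use strict;
use warnings;

# Multipliers for the suffixes discussed in this thread (binary units).
my %MULT = (k => 1024, m => 1024**2, g => 1024**3);

# parse_size is a hypothetical helper, not part of the posted script.
# Returns the size in bytes, or undef if the argument is not understood.
sub parse_size {
    my ($spec) = @_;
    return undef unless defined $spec && $spec =~ /^(\d+)([kmg]?)$/i;
    my ($n, $suffix) = ($1, lc $2);
    # No suffix means the number is already a byte count.
    return $suffix eq '' ? $n : $n * $MULT{$suffix};
}

print parse_size('100M'), "\n";   # prints 104857600
```

Adding a new unit then means adding one entry to the table and one letter to the character class.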
Re: File splitting script
by jwkrahn (Abbot) on Apr 21, 2007 at 21:05 UTC
    open(F,$f) || die "OPEN : $!\n";
    binmode(F);

    FILE:{
    if(($fsize - $size)<=0) {
        my $count=0;
        print "Creating file $nr\n";
        open(C,">$nr") || die "CREATE : $!\n";
        while($count != $fsize) {
            $count+=read(F,$buffer,$BUFFER_SIZE);
            print C $buffer;
        }
        close C;
        last FILE;
    }
    else {
        my $count=0;
        print "Creating file $nr\n";
        open(C,">$nr") || die "CREATE : $!\n";
        while($count != $size) {
            $count+=read(F,$buffer,$BUFFER_SIZE);
            print C $buffer;
        }
        close C;
        $fsize-=$size;
        $nr++;
        redo FILE;
    }
    }

    You binmode the input filehandle but you don't binmode the output filehandle?

    read may not return exactly $BUFFER_SIZE bytes so this could result in an infinite loop.
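
One possible shape for a loop that copes with short reads is sketched below. It assumes lexical filehandles and a hypothetical helper named copy_bytes (neither is in the posted script): it asks read for no more than the chunk still needs, checks the return value for an error (undef) or end of file (0), and compares with < rather than != so the counter cannot skip past the target.

```perl
use strict;
use warnings;

# copy_bytes is a hypothetical helper: copy up to $want bytes from $in to
# $out and return how many bytes were actually copied.
sub copy_bytes {
    my ($in, $out, $want, $bufsize) = @_;
    $bufsize ||= 4096;                     # default buffer size (assumption)
    my $copied = 0;
    while ($copied < $want) {
        my $left = $want - $copied;
        my $ask  = $left < $bufsize ? $left : $bufsize;
        my $got  = read($in, my $buffer, $ask);
        die "READ : $!\n" unless defined $got;   # undef signals a read error
        last if $got == 0;                       # 0 signals end of file
        print {$out} $buffer or die "WRITE : $!\n";
        $copied += $got;
    }
    return $copied;
}
```

A caller could invoke this once per chunk with $want set to the chunk size, and stop when it returns fewer bytes than requested.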

      Fixed, thanks!
