http://qs321.pair.com?node_id=299838


in reply to Re: Categorize files in directory
in thread Categorize files in directory

Here are a couple of sample files.

File 1) FULL occurred 3 times, vs. the second file, where it occurred only once:

FULL322418809544200 444FM15852298FP02 1033019970623200307072003EI +CHIN + BEDFORD MA01731 484BB02785SUN MID-ATLANTI + 919 EAST MAIN ST. HDQ 4309 RICHMOND + VA23219 FULL322418809544200 TX79902 OA910UT +17124QWEST PHOENIX RHPS 20 E THOMAS ST 5TH FLOOR + PHOENIX AZ85012 + FULL322418809544200 444FM15 +852298FJ02 1033019970707200307072003EICHIN ST. + HDQ 4309 RICHMOND VA23212 + &&

File 2) FULL occurred only once.

FULL322418214114900 444FM15852013FI02 1120619930326200307012003FA +RAGO 458ON08815FLEET 8022667304690 NAAMANS RD + CLAYMONT DE197 +03 &&
thanks!

Each file ends with "&&".

Edit, BazB added code tags.

Re: Re: Re: Categorize files in directory
by davido (Cardinal) on Oct 16, 2003 at 18:59 UTC
    You haven't really told us what you want to do with the files. "Separate"... does that mean put them into different directories? Hold their filenames in different arrays? Break the ones with multiple headers into multiple files? We only know what you tell us.

    Once you've opened the directory and gotten a list of files, you could use this snippet to proceed.

    FILE_LIST: foreach my $filename ( @files ) {
        open FILE, "<", $filename or die "Can't open $filename. $!\n";
        my $headers = 0;
        while ( my $line = <FILE> ) {
            # No trailing \b here: in the sample data FULL runs straight
            # into digits, so there is no word boundary after it.
            $headers++ if $line =~ /\bFULL/;
            if ( $headers > 1 ) {
                # do whatever it is you intend to do with
                # multi-header'ed files.
                next FILE_LIST;
            }
        }
        # Do whatever it is you intended to do with
        # single-header files.
    } continue {
        close FILE;
    }
    # Now you're done.

    As you can see, regular expressions are only a very small part of making this thing work for you.
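For the "opened the directory and gotten a list of files" step, here is a minimal sketch of one way to build @files. The helper name list_files and the use of the current directory are assumptions for illustration, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted full paths of the plain files in $dir.
sub list_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    my @paths = map { "$dir/$_" } readdir $dh;
    closedir $dh;
    # -f keeps plain files only, which also drops "." and ".."
    my @files = grep { -f } @paths;
    return sort @files;
}

my @files = list_files('.');
print "$_\n" for @files;
```

Returning full paths means the later open() calls work regardless of the current working directory.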


    Dave


    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein
Re: Re: Re: Categorize files in directory
by diotalevi (Canon) on Oct 16, 2003 at 18:57 UTC

    Unless I'm mistaken, this snippet should go a good ways toward solving your problem.

    use File::Spec::Functions;
    use File::Copy;

    our @files = glob "*.txt";
    mkdir $_ for qw( multiple single );

    for my $file ( @files ) {
        # More than one FULL header means the file goes to "multiple".
        if ( number_of_headers( $file ) > 1 ) {
            move( $file, catfile( "multiple", $file ) );
        }
        else {
            move( $file, catfile( "single", $file ) );
        }
    }

    sub number_of_headers {
        my $file = shift;
        # Count every occurrence of FULL in the file's contents.
        my $count = () = slurp( $file ) =~ /FULL/g;
        return $count;
    }

    sub slurp {
        my $file = shift;
        local $/;       # no record separator: read the whole file at once
        local *SLURP;
        open SLURP, "<", $file or die "Couldn't open $file for reading: $!";
        my $content = <SLURP>;
        close SLURP or warn "Couldn't close $file: $!";
        return $content;
    }
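The `my $count = () = ... =~ /FULL/g` line is the part that does the counting, so here is a small sketch of that idiom on its own. The two records are abbreviated versions of the sample files from the question (the "..." marks text left out here), and the helper name count_marker is an assumption for illustration.

```perl
use strict;
use warnings;

# Count non-overlapping occurrences of a marker anywhere in a string.
sub count_marker {
    my ($text, $marker) = @_;
    # Assigning the global match to an empty list forces list context;
    # assigning that list assignment to a scalar yields the match count.
    my $count = () = $text =~ /\Q$marker\E/g;
    return $count;
}

# Abbreviated sample records; "..." stands for omitted data.
my $file1 = 'FULL322418809544200 444FM15 ... VA23219 FULL322418809544200 TX79902 ... FULL322418809544200 444FM15 &&';
my $file2 = 'FULL322418214114900 444FM15852013FI02 ... &&';

for my $record ($file1, $file2) {
    my $n = count_marker($record, 'FULL');
    printf "%d header(s): %s\n", $n, $n > 1 ? 'multiple' : 'single';
}
```

With these two records it prints "3 header(s): multiple" and then "1 header(s): single".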
Re: Re: Re: Categorize files in directory
by sunadmn (Curate) on Oct 16, 2003 at 18:46 UTC
    OK, so let's get this straight: you want to READ the files and, depending on the number of "FULL" lines in them, either sort the files into different directories or just produce a file with a count of how many of each there are?

      I have used the code by l2kashe:

      Looks like it's moving everything from one directory to the other.

      #!/usr/bin/perl
      use strict;

      my $data   = '/path/to/base/dir';
      my $multi  = '/path/to/multi/full/dir';
      my $single = '/path/to/single/full/dir';

      opendir(DATA, $data) or die "Cant opendir $data: $!\n";

      for ( grep !/^\./, readdir(DATA) ) {
          my $in = "$data/$_";
          open(IN, $in) or die "Cant open $in: $!\n";
          my $count = grep /^(?:\s+|)FULL/, <IN>;
          close(IN);

          if ( $count >= 2 ) {
              rename($in, "$multi/$_")
                  or (warn "Couldnt move $_ to $multi: $!\n" and next);
          }
          else {
              rename($in, "$single/$_")
                  or (warn "Couldnt move $_ to $single: $!\n" and next);
          }
      } # END for grep

      closedir(DATA)

      Edit, BazB: close code tag, remove random characters.

        I'm going to go out on a limb and assume that the data files are the only files in said directory, and that 'FULL' will only occur at the beginning of a line, optionally after leading whitespace. With that said:

        #!/usr/bin/perl
        use strict;

        my $data   = '/path/to/base/dir';
        my $multi  = '/path/to/multi/full/dir';
        my $single = '/path/to/single/full/dir';

        opendir(DATA, $data) or die "Cant opendir $data: $!\n";

        for ( grep !/^\./, readdir(DATA) ) {
            my $in = "$data/$_";
            open(IN, $in) or die "Cant open $in: $!\n";
            # In scalar context grep returns the number of matching lines.
            my $count = grep /^(?:\s+|)FULL/, <IN>;
            close(IN);

            if ( $count >= 2 ) {
                rename($in, "$multi/$_")
                    or (warn "Couldnt move $_ to $multi: $!\n" and next);
            }
            else {
                rename($in, "$single/$_")
                    or (warn "Couldnt move $_ to $single: $!\n" and next);
            }
        } # END for grep

        closedir(DATA);

        Edit: added test for success on rename :P..
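The line-start assumption matters for the result. As a rough illustration, assuming a made-up record in which several FULL headers share one physical line (not taken from the real data), counting matching lines and counting occurrences give different answers:

```perl
use strict;
use warnings;

# Hypothetical record: one physical line containing three FULL headers.
my @lines = ("FULL111 AAA VA23219 FULL111 BBB FULL111 CCC &&\n");

# Counts lines that begin with FULL (optionally after whitespace).
my $per_line = grep { /^(?:\s+|)FULL/ } @lines;

# Counts every occurrence of FULL, wherever it falls on the line.
my $anywhere = () = join('', @lines) =~ /FULL/g;

print "lines starting with FULL: $per_line\n";   # prints 1
print "occurrences of FULL:      $anywhere\n";   # prints 3
```

If the real files look like this, a per-line count never reaches 2, so every file would be treated as single-header.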

        use perl;

        I just need to copy files from the source into 2 different directories, one with single header and the other with multiple headers! Any help would be appreciated... thanks a lot!

        Global symbol "$data" requires explicit package name at test3_final.pl line 4.
        Global symbol "$ONE" requires explicit package name at test3_final.pl line 5.
        Global symbol "$TWO" requires explicit package name at test3_final.pl line 6.
        Global symbol "$data" requires explicit package name at test3_final.pl line 8.
        Global symbol "$data" requires explicit package name at test3_final.pl line 8.
        Global symbol "$data" requires explicit package name at test3_final.pl line 11.
        Global symbol "$TWO" requires explicit package name at test3_final.pl line 18.
        Global symbol "$ONE" requires explicit package name at test3_final.pl line 21.
        Execution of test3_final.pl aborted due to compilation errors.

        use strict;

        $data = "/export/home/credpol03/e7uryp/PERL/test";
        $ONE = "/export/home/credpol03/e7uryp/PERL/ONE";
        $TWO = "/export/home/credpol03/e7uryp/PERL/TWO";

        opendir(DATA, $data) or die "Cant opendir $data: $!\n";

        for ( grep !/^\./, readdir(DATA) ) {
            my $in = "$data/$_";
            open(IN, $in) or die "Cant open $in: $!\n";
            my $count = grep /^(?:\s+|)FULL/, <IN>;
            close(IN);

            if ( $count >= 2 ) {
                rename($in, "$multi/$_");
            }
            else {
                rename($in, "$single/$_");
            }
        } # END for grep

        closedir(DATA);
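The "Global symbol ... requires explicit package name" errors are what `use strict` reports for variables that were never declared; declaring each one with `my` clears them. Below is a minimal sketch of one possible fix, not a definitive version. The sub name sort_by_headers and the command-line argument convention are assumptions for illustration. It uses copy from File::Copy because the stated goal is to copy rather than move, and it counts FULL anywhere in the file rather than only at the start of a line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);

# Copy every plain file in $src into $single or $multi, depending on how
# many times the marker FULL occurs anywhere in the file.
sub sort_by_headers {
    my ($src, $single, $multi) = @_;    # declared with "my", as strict requires
    opendir(my $dh, $src) or die "Can't opendir $src: $!\n";
    for my $name ( grep { !/^\./ } readdir $dh ) {
        my $in = "$src/$name";
        next unless -f $in;             # skip subdirectories and the like
        open(my $fh, '<', $in) or die "Can't open $in: $!\n";
        my $content = do { local $/; <$fh> };   # slurp the whole file
        close $fh;
        my $count = () = $content =~ /FULL/g;   # number of FULL occurrences
        my $dest  = $count >= 2 ? $multi : $single;
        copy($in, "$dest/$name")
            or warn "Couldn't copy $name to $dest: $!\n";
    }
    closedir $dh;
    return;
}

# Assumed usage: script.pl SOURCE_DIR SINGLE_DIR MULTI_DIR
sort_by_headers(@ARGV) if @ARGV == 3;
```

Both destination directories are expected to exist already; the source files are left in place.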