Re: Categorize files in directory

by l2kashe (Deacon)
on Oct 16, 2003 at 18:31 UTC ( [id://299836] )

in reply to Categorize files in directory

Can you give sample files? They don't need to contain your real data, as long as they are in the right format. There are many ways to accomplish what you would like to do, but it is hard to help without more information.

use perl;

Replies are listed 'Best First'.
Re: Re: Categorize files in directory
by kris2000 (Initiate) on Oct 16, 2003 at 18:42 UTC

    Here are a couple of sample files.

    File 1) FULL occurred 3 times, vs. the second file, where it occurred only once.

    FULL322418809544200 444FM15852298FP02 1033019970623200307072003EI +CHIN + BEDFORD MA01731 484BB02785SUN MID-ATLANTI + 919 EAST MAIN ST. HDQ 4309 RICHMOND + VA23219 FULL322418809544200 TX79902 OA910UT +17124QWEST PHOENIX RHPS 20 E THOMAS ST 5TH FLOOR + PHOENIX AZ85012 + FULL322418809544200 444FM15 +852298FJ02 1033019970707200307072003EICHIN ST. + HDQ 4309 RICHMOND VA23212 + &&

    File 2) FULL occurred only once.

    FULL322418214114900 444FM15852013FI02 1120619930326200307012003FA +RAGO 458ON08815FLEET 8022667304690 NAAMANS RD + CLAYMONT DE197 +03 &&

    Each file ends with "&&".

    Edit, BazB added code tags.

      You haven't really told us what you want to do with the files. "Separate"... does that mean put them into different directories? Hold their filenames in different arrays? Break the ones with multiple headers into multiple files? We only know what you tell us.

      Once you've opened the directory and gotten a list of files, you could use this snippet to proceed.

      FILE_LIST:
      foreach my $filename ( @files ) {
          open FILE, "<", $filename
              or die "Can't open $filename. $!\n";
          my $headers = 0;
          while ( my $line = <FILE> ) {
              # Count every FULL on the line; no trailing \b, since the
              # samples have digits right after FULL.
              my @hits = $line =~ /\bFULL/g;
              $headers += @hits;
              if ( $headers > 1 ) {
                  # do whatever it is you intend to do with
                  # multi-header'ed files.
                  next FILE_LIST;
              }
          }
          # Do whatever it is you intended to do with
          # single-header files.
      }
      continue {
          close FILE;
      }
      # Now you're done.

      As you can see, regular expressions are only a very small part of making this thing work for you.
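The snippet above assumes @files is already populated. A minimal sketch of that first step, assuming the files sit directly in one directory and that dotfiles and subdirectories should be skipped; the sub name list_files is hypothetical and the path in the usage line is a placeholder:

```perl
use strict;
use warnings;

# Return the plain files in $dir as full paths, sorted by name.
# Dotfiles and subdirectories are skipped.
# list_files is a hypothetical helper name.
sub list_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Can't opendir $dir: $!\n";
    my @names = grep { !/^\./ } readdir $dh;
    closedir $dh;
    my @files = grep { -f $_ } map { "$dir/$_" } @names;
    return sort { $a cmp $b } @files;
}
```

Usage would look like `my @files = list_files('/path/to/base/dir');`. Returning full paths means the later open calls work no matter what the current directory is.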


      "If I had my life to do over again, I'd be a plumber." -- Albert Einstein

      Unless I'm mistaken, this snippet should go a good ways toward solving your problem.

      use File::Spec::Functions;
      use File::Copy;

      our @files = glob "*.txt";

      # Create the target directories if they don't exist yet.
      for my $dir ( qw( multiple single ) ) {
          -d $dir or mkdir $dir or die "Couldn't mkdir $dir: $!";
      }

      for my $file ( @files ) {
          # More than one FULL header means a multi-header file.
          if ( number_of_headers( $file ) > 1 ) {
              move( $file, catfile( "multiple", $file ) );
          }
          else {
              move( $file, catfile( "single", $file ) );
          }
      }

      sub number_of_headers {
          my $file = shift;
          my $count = () = slurp( $file ) =~ /FULL/g;
          return $count;
      }

      sub slurp {
          my $file = shift;
          local $/;
          local *SLURP;
          open SLURP, "<", $file
              or die "Couldn't open $file for reading: $!";
          my $content = <SLURP>;
          close SLURP or warn "Couldn't close $file: $!";
          return $content;
      }

      OK, so let's get this straight: you want to read the files and, depending on the number of "FULL" lines in them, either sort the files into different directories or just produce a file with a count of how many of each there are?

        I have used the code by l2kashe:

        Looks like it's moving everything from one directory to the other.

        #!/usr/bin/perl
        use strict;

        my $data   = '/path/to/base/dir';
        my $multi  = '/path/to/multi/full/dir';
        my $single = '/path/to/single/full/dir';

        opendir(DATA, $data) or die "Cant opendir $data: $!\n";
        for ( grep !/^\./, readdir(DATA) ) {
            my $in = "$data/$_";
            open(IN, $in) or die "Cant open $in: $!\n";
            my $count = grep /^(?:\s+|)FULL/, <IN>;
            close(IN);

            if ( $count >= 2 ) {
                rename($in, "$multi/$_")
                    or (warn "Couldnt move $_ to $multi: $!\n" and next);
            }
            else {
                rename($in, "$single/$_")
                    or (warn "Couldnt move $_ to $single: $!\n" and next);
            }
        } # END for grep
        closedir(DATA);

        Edit, BazB: close code tag, remove random characters.
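One possible explanation for everything landing in the same directory: `grep /^(?:\s+|)FULL/, <IN>` counts lines that begin with FULL, not occurrences of FULL. If a file holds several FULL headers on one physical line, as sample file 1 appears to as posted, the count never reaches 2. A minimal sketch that counts occurrences instead, assuming every header is FULL followed directly by a digit as in the samples; count_full is a hypothetical helper name:

```perl
use strict;
use warnings;

# Count every occurrence of the FULL marker in a file, regardless of
# how the records are split across lines.
# Assumption: a header is "FULL" followed immediately by a digit,
# as in the posted samples. count_full is a hypothetical name.
sub count_full {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    my $content = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    $content = '' unless defined $content;   # guard against undef
    my @hits = $content =~ /\bFULL(?=\d)/g;  # one entry per header
    return scalar @hits;
}
```

In the loop above, `my $count = count_full($in);` could stand in for the open/grep/close lines; the `$count >= 2` test would then stay as it is.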