
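Here's one way to do it. It makes two passes over the files and keeps only the phone numbers, serial numbers, and pending changes in memory, so it does not need much memory and should be fairly fast. Not tested on 1MB files. :)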

#!perl
use strict;

## Files to be loaded:
my @files = qw(file1.txt file2.txt file3.txt);

## New file to create:
my $newfoo = "Bigfile.txt";

my ($file, $number, %phone, %serial, %change);

{
local $/='';
for $file (@files) {
  open(FOO, "$file") or die "Could not open $file: $!\n";
  while(<FOO>) {
    ## Check for duplicate phone number
    /^Phone (.*)/m or die "No phone for record $.\n";
    $phone{$1}++ and $change{"$file$."}=-1 and next;

    ## If phone number cool, check the serial number:
    /^SERIAL NUMBER (\d+)/ or die "Bad serial number for record $.\n";
    ## Make sure it is unique, if not, add one
    if ($serial{$1}++) {
      $number=$1;
      1 while $serial{++$number};
      $serial{$number}++;
      $change{"$file$."}=$number;
    }
  }
  close(FOO);
} ## go to next file in the list

## Loop through them all again, this time for keeps :)
open(NEWFOO, ">$newfoo") or die "Could not create $newfoo: $!\n";
for $file (@files) {
  open(FOO, "$file") or die "Could not open $file: $!\n";
  while(<FOO>) {
    if ($change{"$file$."}) {
      $change{"$file$."}==-1 and next;
      s/^SERIAL NUMBER (\d+)/SERIAL NUMBER $change{"$file$."}/;
    }
    print NEWFOO $_;
  }
  close(FOO); ## an explicit close resets $. so record numbers match the first pass
}
close(NEWFOO);
} ## end generic loop
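
For anyone who wants to see the two idioms in isolation, here is a minimal sketch of paragraph mode plus the "bump a duplicate serial to the next free number" step. The sample records and the claim_serial helper are made up for illustration and are not part of the script above, and reading from an in-memory string assumes perl 5.8 or later.

```perl
#!perl
use strict;
use warnings;

## Hypothetical sample data: two records separated by a blank line,
## both claiming serial number 100.
my $data = "SERIAL NUMBER 100\nPhone 555-1234\n\n"
         . "SERIAL NUMBER 100\nPhone 555-9999\n";

## Hypothetical helper: return the first unused serial number at or
## after $want, and mark it as used in the %$seen hash.
sub claim_serial {
  my ($seen, $want) = @_;
  $want++ while $seen->{$want};
  $seen->{$want}++;
  return $want;
}

my (%serial, @assigned);
{
  local $/ = '';  ## paragraph mode: one blank-line-separated record per read
  open(my $fh, '<', \$data) or die "Could not open in-memory data: $!\n";
  while (<$fh>) {
    /^SERIAL NUMBER (\d+)/ or die "Bad serial number for record $.\n";
    ## $. counts records here, not lines, because of paragraph mode
    push @assigned, "$.:" . claim_serial(\%serial, $1);
  }
  close($fh);
}

print "$_\n" for @assigned;  ## prints 1:100 then 2:101
```

The second record keeps its place in the file but gets 101, since 100 was already taken by the first one.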

In reply to Re: Efficiency and Large Arrays by turnstep
in thread Efficiency and Large Arrays by Kozz
