Currently you're slurping the entire data set into memory at once; not only that, you're copying possibly huge chunks of it several more times. If you can build your code around a while() loop and process one line at a time instead of slurping the entire file, you'd be much better off, memory-wise.
# instead of this:
my @lines = <DATA>;
# do something like this:
while (my $line = <DATA>) {
...
}
Even if you only build @matches in that loop and keep the rest of the code the same, you may be much better off (assuming you have few matches compared to the size of the dataset). Deleting arrays after you're done with them (use my and arrange the code so they go out of lexical scope) will also help with memory reuse.
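Here's a rough sketch of what I mean by building @matches in the loop and letting it go out of scope. I don't know your real data or match criteria, so the sample text, the $pattern regex, and the $count variable are all made up for illustration, and I'm reading from an in-memory filehandle instead of DATA just so the snippet runs on its own:

```perl
use strict;
use warnings;

# Hypothetical sample input and pattern, for illustration only;
# the real data and match criteria come from your script.
my $text    = "apple 1\nbanana 2\napple 3\ncherry 4\n";
my $pattern = qr/^apple/;

# In-memory filehandle so the sketch is self-contained;
# in your script this would be <DATA> or a handle from open().
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $count = 0;
{
    # @matches is lexical to this block, so perl can reuse its
    # memory once the block ends.
    my @matches;

    # Only one line is held in $line at a time.
    while (my $line = <$fh>) {
        chomp $line;
        push @matches, $line if $line =~ $pattern;
    }

    $count = @matches;            # array in scalar context: element count
    print "$_\n" for @matches;
}
close $fh;

print "matched $count line(s)\n";
```

Only the matching lines are ever stored, so peak memory tracks the number of matches rather than the size of the input.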
If you can more clearly explain what this code is supposed to do, we might be able to find a much more straightforward solution. As it is, the code seems to be doing the same thing over again several times in different ways before printing its final results.
Alan