Warning, untested code:
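my %stathash;
while (<FH>) {
    chomp;            # strip the newline so a final unterminated line still matches
    $stathash{$_}++;  # count how many times each distinct line occurs
}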
This has the extra advantage of counting the number of hits for each unique value.
You can then do some groovy stuff, like pulling out records which occur n times. If you use two hashes (one per dataset), you can also find records which appear in one set and not the other, or records which appear in both sets.
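Here is a rough sketch of those three lookups, in the same untested spirit as the snippet above. The sample lists @set_a and @set_b and the names %count_a and %count_b are made up for illustration; in practice you would fill each hash from its own filehandle as shown earlier.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for two files' worth of records.
my @set_a = qw(apple banana apple cherry banana apple);
my @set_b = qw(banana date cherry cherry);

# One count hash per dataset, built the same way as %stathash.
my (%count_a, %count_b);
$count_a{$_}++ for @set_a;
$count_b{$_}++ for @set_b;

# Records that occur exactly $n times in set A.
my $n = 2;
my @n_times = grep { $count_a{$_} == $n } sort keys %count_a;

# Records in set A that never appear in set B.
my @only_a = grep { !exists $count_b{$_} } sort keys %count_a;

# Records that appear in both sets.
my @in_both = grep { exists $count_b{$_} } sort keys %count_a;

print "occurs $n times in A: @n_times\n";   # banana
print "only in A: @only_a\n";               # apple
print "in both: @in_both\n";                # banana cherry
```

Using exists rather than testing the value means the lookup never creates new keys in the other hash, so the counts stay clean if you reuse them afterwards.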
I regularly do this with sets of about 500k records to determine where my data integrity issues lie, and it's pretty damn fast.