http://qs321.pair.com?node_id=1034814


in reply to How to improve this data structure?

If you want to preserve the data, a database is definitely an interesting solution. I haven't tried using SQLite and similar when I have only a transient need for the data, as in traditional extract / transform / load situations.

If you know the number of distinct region numbers is significantly smaller than the number of records, you could use multiple StatsArrays, one for each region number. Since you don't know how many there will be, the obvious solution is an array of arrays: the outer array for the regions, the inner arrays for the records in each region. I only wish your regions were numbered 0, 1, 2, 3, ... but that's stretching things a bit, so let's just use a hash that maps region number to array index.

I haven't run it, so there may be bugs. When a new RegionNum comes along, you detect it isn't in the hash, so you add a new entry with the next array slot and provide an anonymous array for the records to go in. Then, when you need the records sorted, it is far easier to sort just the subset with the same RegionNum. With N regions of about M records each, sorting them separately costs roughly N x M log(M), which is faster than sorting everything at once at NM log(NM).

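    package RegionRecords;

    my %regionNums;     # region number -> index into @all_regions
    my @all_regions;    # one anonymous array of records per region
    my $N = 0;          # next free slot in @all_regions

    sub add {
        my ( $record ) = @_;
        my $region = $record->{RegionNum};
        if ( ! exists $regionNums{$region} ) {
            $regionNums{$region} = $N;
            $all_regions[$N] = [];
            $N++;
        }
        push @{ $all_regions[ $regionNums{$region} ] }, $record;
    }

    # Every record, region by region, in order of first appearance.
    sub allRecords { return map { @$_ } @all_regions }

    1;

    package main;

    # $RegionNum, @AR and @BCR come from your existing code.
    RegionRecords::add( {RegionNum=>$RegionNum, AR=>$AR[$RegionNum], BCR=>$BCR[$RegionNum]} );
    print RegionRecords::allRecords();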
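
To show the sorting step, here is a minimal self-contained sketch of the same hash-to-slot idea with a per-region sort on top. The sample records, the names %slot_for, @by_region and sorted_region, and the choice of AR as the sort key are all my assumptions for illustration, not anything from your code.

```perl
use strict;
use warnings;

# Hypothetical sample records; the AR and BCR values are made up.
my @records = (
    { RegionNum => 42, AR => 3.5, BCR => 1 },
    { RegionNum => 7,  AR => 1.2, BCR => 2 },
    { RegionNum => 42, AR => 0.9, BCR => 3 },
    { RegionNum => 7,  AR => 4.1, BCR => 4 },
    { RegionNum => 42, AR => 2.0, BCR => 5 },
);

my %slot_for;     # region number -> index into @by_region
my @by_region;    # array of arrays: one inner array per region

for my $rec (@records) {
    my $region = $rec->{RegionNum};
    if ( !exists $slot_for{$region} ) {
        $slot_for{$region} = scalar @by_region;    # next free slot
        push @by_region, [];
    }
    push @{ $by_region[ $slot_for{$region} ] }, $rec;
}

# Sort only one region's records, ascending by AR (the key is an assumption).
# Returns an empty list for a region that was never seen.
sub sorted_region {
    my ($region) = @_;
    return () unless exists $slot_for{$region};
    return sort { $a->{AR} <=> $b->{AR} } @{ $by_region[ $slot_for{$region} ] };
}

printf "%d: %s\n", $_->{RegionNum}, $_->{AR} for sorted_region(42);
```

Only the three records of region 42 go through sort here; the records of region 7 are never touched, which is where the saving comes from.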

As Occam said: Entia non sunt multiplicanda praeter necessitatem.