robbiebow has asked for the wisdom of the Perl Monks concerning the following question:
Wise ones, I seek assistance!
Having returned to Perl after not long enough on glorious beaches, I'm floundering with a nested array of hashes. I'll try to explain:
I have a pipe-delimited text file, each line representing a 'Category'. Within each Category is a number of Items, each with a description and reference number. Here's an example:
'Coffee Suppliers','BB100'|'Tea Suppliers','BB106'
'Plasterers','BL100'|'Fencing Companies','BL102'
I also have another file with the Category descriptions in. For example:
'BB','Drinks Industry'
'BL','Construction Industry'
What I want to do is:
(a) reverse the order of the first set of records so that the reference comes first, then the description (so I can use these as hashes / associative arrays when I read the data normally); and
(b) merge the two sets with the second set of data being popped onto the beginning of each category array.
(As you have guessed, the first two characters of the Item reference are the same as the Category reference.)
The outcome would be:
'BB','Drinks Industry'|'BB100','Coffee Suppliers'|'BB106','Tea Suppliers'
'BL','Construction Industry'|'BL100','Plasterers'|'BL102','Fencing Companies'
Any advice gratefully received.
Re: Multidimensional array of hashes
by broquaint (Abbot) on Oct 03, 2003 at 12:03 UTC
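With liberal use of map and grep, this is quite simple:
use strict;
use warnings;
use File::Slurp;

# One [reference, description] pair per item, fields reversed.
my @cats = map {
    chomp;
    map [reverse split ','], split /\|/;
} read_file('your_category_file');

# Category reference => category description (the quotes are kept).
my %desc = map {
    chomp;
    split ',', $_, 2;
} read_file('your_description_file');

my @res = map {
    my $d = $_;
    join '|',
        "$d,$desc{$d}",
        map { join ',', @$_ }
        # $d still ends in a quote; the ? makes that quote optional,
        # so 'BB' matches 'BB100'.
        grep { $_->[0] =~ /^$d?/ } @cats
} sort keys %desc;    # sorted so the output order is predictable

print "$_\n" for @res;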
__output__
'BB','Drinks Industry'|'BB100','Coffee Suppliers'|'BB106','Tea Suppliers'
'BL','Construction Industry'|'BL100','Plasterers'|'BL102','Fencing Companies'
So we build an array of categories (with fields reversed), then a hash of descriptions, then create an array of strings, one per description, ready to be written to a file.
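The grep above rescans every item once per category. For larger files, one variation is to bucket the items by their two-letter prefix in a single pass. This is only a minimal sketch using core Perl: the sub name `merge_categories` and the in-memory line lists are assumptions made for illustration, not part of the reply above, and it assumes item descriptions contain no commas.

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs of raw lines, returns merged lines.
sub merge_categories {
    my ($item_lines, $desc_lines) = @_;

    # Bucket the reversed items by the two characters after the opening quote.
    my %items_for;
    for my $line (@$item_lines) {
        chomp(my $copy = $line);
        for my $pair (split /\|/, $copy) {
            my ($name, $ref) = split /,/, $pair, 2;
            my ($prefix) = $ref =~ /^'(\w\w)/ or next;
            push @{ $items_for{$prefix} }, "$ref,$name";
        }
    }

    # Emit one line per category, in the order of the description file.
    my @merged;
    for my $line (@$desc_lines) {
        chomp(my $copy = $line);
        my ($code) = $copy =~ /^'(\w\w)'/ or next;
        push @merged, join '|', $copy, @{ $items_for{$code} || [] };
    }
    return @merged;
}

print "$_\n" for merge_categories(
    [ "'Coffee Suppliers','BB100'|'Tea Suppliers','BB106'\n",
      "'Plasterers','BL100'|'Fencing Companies','BL102'\n" ],
    [ "'BB','Drinks Industry'\n", "'BL','Construction Industry'\n" ],
);
```

Keeping the description file's own order also avoids depending on hash key order.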
Re: Multidimensional array of hashes
by Roger (Parson) on Oct 03, 2003 at 11:54 UTC
Because your data files are very simple, only one hash table is required, to hold the industry lookup. I have written a quick and simple script to do what you described. Keep the script simple and straightforward.
use strict;
use warnings;
use IO::File;

my %industries;

# Load the industry definition file
my $def = IO::File->new("def.txt", "r") or die "Cannot open def.txt: $!";
while (<$def>) {
    if (m/'(.*?)','(.*?)'/) {
        $industries{$1} = $2;
    }
}
undef $def;    # close the file

# Load the data file
my $dat = IO::File->new("dat.txt", "r") or die "Cannot open dat.txt: $!";
while (<$dat>) {
    s/('.*?'),('.*?')/$2,$1/g;    # swap the items
    my $id = substr($_, 1, 2);    # category code: two characters after the opening quote
    print "'$id','$industries{$id}'|$_";
}
undef $dat;
The input files are -
-- def.txt --
'BB','Drinks Industry'
'BL','Construction Industry'
-- dat.txt --
'Coffee Suppliers','BB100'|'Tea Suppliers','BB106'
'Plasterers','BL100'|'Fencing Companies','BL102'
And the output is -
'BB','Drinks Industry'|'BB100','Coffee Suppliers'|'BB106','Tea Suppliers'
'BL','Construction Industry'|'BL100','Plasterers'|'BL102','Fencing Companies'
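The swap-and-lookup step can also be tried on a single line without any files. The following is a rough sketch rather than part of the script above; the sub name `transform_line` and the choice to return nothing for an unknown prefix are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: swap each 'name','ref' pair and prepend the industry record.
sub transform_line {
    my ($line, $industries) = @_;
    chomp $line;
    $line =~ s/('.*?'),('.*?')/$2,$1/g;         # 'name','ref' becomes 'ref','name'
    my $id = substr $line, 1, 2;                # two characters after the opening quote
    return unless exists $industries->{$id};    # unknown prefix: let the caller decide
    return "'$id','$industries->{$id}'|$line";
}

my %industries = (BB => 'Drinks Industry', BL => 'Construction Industry');
print transform_line("'Plasterers','BL100'|'Fencing Companies','BL102'\n", \%industries), "\n";
```

Checking `exists` first avoids an "uninitialized value" warning when a data line uses a prefix that is missing from the definition file.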
Cheers. Roger
Re: Multidimensional array of hashes
by jonadab (Parson) on Oct 03, 2003 at 12:53 UTC
Other monks have shown you ways of dealing with this
specific problem. I'd like to comment on your node
title, multidimensional arrays of hashes. Perl's
system of handling such things using references is
really cool, IMO. Because the elements of an array
or the values of a hash can hold references, and
because a reference can point to an array or a hash,
it is possible to nest these structures to arbitrary
depth, creating such things as, as you put it,
multidimensional arrays of hashes. That's not the
cool part. The cool part is that when you put
the array and hash subscripts back to back, the
dereferencing is automatically implied. Thus,
you can just do things like this:
$db{BB}{industry} = "Drinks Industry";
$db{BB}{supplier}{BB100} = "Coffee Suppliers";
The above does exactly what you would want it to do.
(In practice, rather than filling in individual
values as above, you could fill in the data while
reading your input file, or something like it.)
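As a rough sketch of filling in the structure while reading the input, the following builds the same shape of hash from lines in the format given in the question. The sub name `build_db` and the in-memory line lists are assumptions made for illustration; real code would read the lines from the two files.

```perl
use strict;
use warnings;

# Hypothetical helper: build the nested hash from raw input lines.
sub build_db {
    my ($desc_lines, $item_lines) = @_;
    my %db;

    # 'BB','Drinks Industry' becomes $db{BB}{industry}
    for my $line (@$desc_lines) {
        my ($code, $industry) = $line =~ /^'(.*?)','(.*?)'/ or next;
        $db{$code}{industry} = $industry;
    }

    # 'Coffee Suppliers','BB100' becomes $db{BB}{supplier}{BB100}
    for my $line (@$item_lines) {
        while ($line =~ /'(.*?)','(.*?)'/g) {
            my ($name, $ref) = ($1, $2);
            my $code = substr($ref, 0, 2);
            $db{$code}{supplier}{$ref} = $name;
        }
    }
    return \%db;
}

my $db = build_db(
    [ "'BB','Drinks Industry'\n" ],
    [ "'Coffee Suppliers','BB100'|'Tea Suppliers','BB106'\n" ],
);
print "$db->{BB}{supplier}{BB106}\n";    # prints "Tea Suppliers"
```

The intermediate hashes spring into existence on first assignment (autovivification), so nothing needs to be initialised beforehand.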
Now, you want a list of all the industries? No
problem...
my @industrycodes = sort keys %db;
foreach my $i (@industrycodes) {
    print "$i => $db{$i}{industry}\n";
}
And if you also want the suppliers? Again,
no problem...
my @industrycodes = sort keys %db;
foreach my $i (@industrycodes) {
    print "Suppliers in the $db{$i}{industry}:\n";
    my %s = %{ $db{$i}{supplier} };
    # Note that: We wanted the hash of suppliers,
    # so we had to dereference with the hash sigil, %,
    # wrapped around the whole expression in braces.
    # The dereferencing of the references used for the
    # nesting is all handled automatically by the
    # subscripting syntax, so you don't have to mess
    # with it, but when you want to retrieve something
    # other than a scalar (e.g., an array or hash),
    # you do have to dereference the retrieved result.
    foreach my $supplier (sort keys %s) {
        print "\t$supplier => $s{$supplier}\n";
    }
}
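To get from a nested hash of this shape back to the pipe-delimited format asked for in the question, something along these lines may work. It is only a sketch: the sub name `flatten_db` is hypothetical, and the `industry` and `supplier` keys follow the assignments shown earlier in this reply.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten the nested hash into one line per industry.
sub flatten_db {
    my ($db) = @_;
    my @lines;
    for my $code (sort keys %$db) {
        # Fall back to an empty hash when an industry has no suppliers.
        my $suppliers = $db->{$code}{supplier} || {};
        push @lines, join '|',
            "'$code','$db->{$code}{industry}'",
            map "'$_','$suppliers->{$_}'", sort keys %$suppliers;
    }
    return @lines;
}

my %db = (
    BB => {
        industry => 'Drinks Industry',
        supplier => { BB100 => 'Coffee Suppliers', BB106 => 'Tea Suppliers' },
    },
);
print "$_\n" for flatten_db(\%db);
```

Here the hash is walked through a reference with the arrow syntax, which avoids copying the supplier hash on every pass.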
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
Re: Multidimensional array of hashes
by Rhys (Pilgrim) on Oct 03, 2003 at 13:29 UTC
You can use a single hash to store all of this:
open(CATEGORY, "categories.txt") or die "Cannot open categories.txt: $!"; # ...or whatever...
foreach $line ( <CATEGORY> ) {
    chomp $line;    # otherwise the newline ends up inside the last key
    @ITEMS = split /\|/, $line;
    foreach $item ( @ITEMS ) {
        ($value, $key) = split /,/, $item;
        $LIST{$key} = $value;
    }
}
close CATEGORY;
open(DESC, "descriptions.txt") or die "Cannot open descriptions.txt: $!";
# Same code as above, except key/value pairs are reversed.
foreach $line ( <DESC> ) {
    chomp $line;    # otherwise the newline ends up inside the value
    @ITEMS = split /\|/, $line;
    foreach $item ( @ITEMS ) {
        ($key, $value) = split /,/, $item;
        $LIST{$key} = $value;
    }
}
close DESC;
# Now arrange for the output.
open(OUTFILE, ">output.txt") or die "Cannot open output.txt: $!";
foreach $reference ( sort keys %LIST ) {
    # Sometimes, it's a new category. These are most important.
    if ( $reference =~ /^\'\w\w\'$/ ) {
        print OUTFILE "\n";
    } else {
        print OUTFILE "|";
    }
    print OUTFILE "$reference,$LIST{$reference}";
}
close OUTFILE;
That should do it. Since they're sorted, your category references will always precede your item references. Since the code above checks for that first, you'll start the new line just in time to deal with that item.
Every printed item is preceded by the delimiter it needs. Newlines for a new category, pipes for a new item. That way you don't get extra trailing separators.
There's no error checking code in here, though. If your two input files aren't pristine, you'll get some junk in there. Also, your category references *must* be two alpha characters. If you change that, you might need to change the pattern for recognizing a new category to something more flexible like simply ruling out an item:
if ( $reference !~ /^\'\w+\d+\'$/ ) {
# ...
}
If it isn't an item, maybe it's a category. :-)
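A quick way to see how that looser test classifies references is to run it over a few sample keys. This is a small sketch, with `is_category` as a hypothetical helper name and the sample keys (including `'CONS'`) invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a quoted reference does not look like an item,
# i.e. it does not end in one or more digits before the closing quote.
sub is_category {
    my ($reference) = @_;
    return $reference !~ /^'\w+\d+'$/;
}

for my $reference ("'BB'", "'BB100'", "'CONS'") {
    printf "%-8s %s\n", $reference, is_category($reference) ? 'category' : 'item';
}
```

Note that `\w` also matches digits, so a category reference that itself ends in a digit would be treated as an item by this test.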
Hope that helps!
--Rhys
edited: Sat Oct 4 12:59:29 2003
by jeffa - s/pre/code/ig
Re: Multidimensional array of hashes
by bm (Hermit) on Oct 03, 2003 at 11:55 UTC
Can you show what you have already tried?