Is this a reasonable data structure?

Theo has asked for the wisdom of the Perl Monks concerning the following question:

Re: Is this a reasonable data structure? by sauoq (Abbot) on Oct 23, 2003 at 20:31 UTC
`my @alldata (%caralo_l, %androno_j, %paterman_s);` ...

"Question: Is this really an array of hashes?"

Well, hardburn almost had it. It's actually just an array. Perl will flatten your hashes. If you wrote it as `my @alldata = (\%caralo_l, \%androno_j, \%paterman_s);` you would have an array of hashes.

"It seems to me kind of a waste to repeat the KEY information in each hash, but I haven’t thought up a better way."

Yes, it is a waste. You might consider just using an array of arrays and then keeping a hash where the keys are your column names and the values are their indexes in the array.

Edit: Added the missing '=' in the assignment. I didn't notice its absence, at first, after cutting and pasting it from the OP.

-sauoq
"My two cents aren't worth a dime.";
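To make the flattening concrete, here is a minimal sketch. The hash contents are made up for illustration; only the hash names come from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample records, made up for illustration.
my %caralo_l  = (first => 'Linda', last => 'Caralo');
my %androno_j = (first => 'Jean',  last => 'Androno');

# Hashes in list context flatten into their key/value pairs.
my @flat = (%caralo_l, %androno_j);

# A backslash takes a reference, so each hash stays intact.
my @refs = (\%caralo_l, \%androno_j);

print scalar(@flat), "\n";     # 8: four keys plus four values
print scalar(@refs), "\n";     # 2: one hash reference per person
print $refs[1]{last}, "\n";    # Androno
```

The flattened array has lost the boundary between the two people, which is why it cannot be treated as an array of hashes.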
Re: Is this a reasonable data structure? by hardburn (Abbot) on Oct 23, 2003 at 20:13 UTC
"Is this really an array of hashes"

No, it's a hash-of-hashes:

```perl
my %data = (
    paterman_s => {
        title => 'Mr',
        first => 'Steve',
        last  => 'Paterman',
        room  => 101,
        phone => 100,
        email => 'stv@net.net',
    },
    # And so on
);
```

Also, please watch the weird Microsoft quoting characters.

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

`:(){ :|:&};:`

Note: All code is untested, unless otherwise stated
Re: Is this a reasonable data structure? by tadman (Prior) on Oct 23, 2003 at 20:32 UTC
Keep in mind that your assignment is totally broken. Missing equals sign aside, see what this does: `my @array = (%hash1, %hash2);` What you end up with is merely a list of the key/value pairs from `%hash1` and `%hash2`, not an array of hashes.

As hardburn suggested, what you want is a hash of hashes (HoH):

```perl
use warnings;
use strict;

my @template;
my %data;

while (<>) {
    chomp;
    my @line = split(/\|/, $_);
    if (!@template) {
        # The first line holds the column names
        @template = @line;
    }
    else {
        my $key = lc($line[2]."_".substr($line[1],0,1));
        # Hash slice: assign every column of this row in one go
        @{$data{$key}}{@template} = @line;
    }
}

use Data::Dumper;
print Dumper(\%data);
```

I really have no idea how you were going to name your hashes like that, so I guessed. Note that you might have to fix the `$key` definition so that two "B.Smith" people don't collide.
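One possible way to keep two "B.Smith" people from colliding is to append a counter when a key is already taken. This is only a sketch: `unique_key` is a hypothetical helper, and the two names are invented for the example.

```perl
use strict;
use warnings;

my %data;

# Hypothetical helper: build "last_f" and append a counter if it is taken.
sub unique_key {
    my ($data, $first, $last) = @_;
    my $base = lc($last . '_' . substr($first, 0, 1));
    my $key  = $base;
    my $n    = 1;
    $key = $base . '_' . ++$n while exists $data->{$key};
    return $key;
}

for my $person (['Bob', 'Smith'], ['Barbara', 'Smith']) {
    my ($first, $last) = @$person;
    my $key = unique_key(\%data, $first, $last);
    $data{$key} = { first => $first, last => $last };
}

print join(', ', sort keys %data), "\n";    # smith_b, smith_b_2
```

The suffix depends on input order, so a key that must stay stable across runs would need something else to disambiguate, such as the room or phone number.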
Re: Is this a reasonable data structure? by etcshadow (Priest) on Oct 23, 2003 at 20:46 UTC
Bearing in mind (as mentioned above) that these should be arrays of hash-references and not hashes (see docs for perlref and perlreftut)...

It's a perfectly reasonable representation for the data, depending on how you want to handle the data. In my own work, I very frequently deal with arrays of hashrefs. Granted, though, it's not the only way, or necessarily the best. That's going to depend on what you want to do with it. This can be a good way to represent very simple objects or rows of a database table.

However, if you want to have a data structure which, in and of itself, enforces the homogeneity of the individual rows/objects, then you can go with an array of array-refs:

```perl
my @data = (
    [ 'Mrs', 'Linda', 'Carolo', '201', '148', 'she@borg.org' ],
    [ 'Mrs', 'Jean', 'Andronlo', '317', '167', 'j@alo.com' ],
    # etc
);
```

And, if you like, you can keep the column name => column index map in a hash (array indexes start at 0):

```perl
my %column = (
    title => 0,
    first => 1,
    last  => 2,
    room  => 3,
    phone => 4,
    email => 5,
);
```

and then you can reference the items in a row:

```perl
foreach my $row (@data) {
    print "$row->[$column{title}] $row->[$column{first}] $row->[$column{last}]\n";
}
```

There's also something in perl called a "pseudo-hash" which is a means by which the language does what I showed above (using a name as an index in an array, rather than a number), but I'd avoid them if possible.

Anyway, that (above) is just one other example of how you might store a table... ultimately the "best" method for storing your data table will be dictated by what you intend to do with it. However, the array of hashrefs is about the simplest, most flexible way (though some people might gripe about the "wasteful"ness (both in space and time) of using a hash-lookup for each element of each row).

------------
:Wq
Not an editor command: Wq
Re: Is this a reasonable data structure? by Roger (Parson) on Oct 24, 2003 at 00:34 UTC
Seems that nobody has pointed this out yet: there is a convenient way to access data stored in your flat file, by using the DBI and DBD::CSV modules.

```perl
use strict;
use DBI;
use DBD::CSV;
use Data::Dumper;

# Connect to CSV database
my $dbh = DBI->connect("DBI:CSV:csv_sep_char=\|")
    or die "Cannot connect: " . $DBI::errstr;
$dbh->{'csv_tables'}->{'addressbook'} = {'file'=>'addressbook.txt' };

# load address book entries
my $sth = $dbh->prepare("SELECT * FROM addressbook");
$sth->execute();

# store data in 2-tier hash table
my %data;
while (my $res = $sth->fetchrow_hashref()) # loop through data
{
    # create hash to store details
    my %rec = map { $_ => $res->{$_} } @{$sth->{NAME}};
    # create top level hash with last name as lookup key
    $data{$rec{"last"}} = \%rec;
}

# cleaning up
$sth->finish;
$dbh->disconnect;

# inspect our result
print Dumper(\%data);
```

The data file:

```
addressbook.txt
---------------
title|first|last|room|phone|email
Mrs|Linda|Caralo|201|148|she@borg.org
Miss|Jean|Androno|317|167|j@alo.com
Mr|Steve|Paterman|101|100|steve@net.net
```

And the hash structure built with the above script:

```
$VAR1 = {
          'Caralo' => {
                        'email' => 'she@borg.org',
                        'first' => 'Linda',
                        'last' => 'Caralo',
                        'title' => 'Mrs',
                        'phone' => '148',
                        'room' => '201'
                      },
          'Paterman' => {
                          'email' => 'steve@net.net',
                          'first' => 'Steve',
                          'last' => 'Paterman',
                          'title' => 'Mr',
                          'phone' => '100',
                          'room' => '101'
                        },
          'Androno' => {
                         'email' => 'j@alo.com',
                         'first' => 'Jean',
                         'last' => 'Androno',
                         'title' => 'Miss',
                         'phone' => '167',
                         'room' => '317'
                       }
        };
```
Re: Re: Is this a reasonable data structure? by BrowserUk (Patriarch) on Oct 24, 2003 at 01:00 UTC
Or he could do it in a quarter of the time with half the code, without having to download and compile modules or read 1000 lines of documentation and learn SQL.

```perl
#! perl -slw
use strict;
use Data::Dumper;

# The first line of DATA holds the field names
my @fields = split'\|', <DATA>;
chomp $fields[-1];

# Build one hash per row, keyed on "Last_F"
my %HoH = map{
    chomp;
    my%h;
    @h{ @fields } = split'\|';
    ( $h{ last } . '_' . substr( $h{ first }, 0, 1 ) => \%h )
} <DATA>;

print Dumper \%HoH;

__DATA__
title|first|last|room|phone|email
Mrs|Linda|Caralo|201|148|she@borg.org
Miss|Jean|Androno|317|167|j@alo.com
Mr|Steve|Paterman|101|100|steve@net.net
```

prints

```
P:\test>test2
$VAR1 = {
          'Paterman_S' => {
                            'email' => 'steve@net.net',
                            'first' => 'Steve',
                            'last' => 'Paterman',
                            'title' => 'Mr',
                            'phone' => '100',
                            'room' => '101'
                          },
          'Caralo_L' => {
                          'email' => 'she@borg.org',
                          'first' => 'Linda',
                          'last' => 'Caralo',
                          'title' => 'Mrs',
                          'phone' => '148',
                          'room' => '201'
                        },
          'Androno_J' => {
                           'email' => 'j@alo.com',
                           'first' => 'Jean',
                           'last' => 'Androno',
                           'title' => 'Miss',
                           'phone' => '167',
                           'room' => '317'
                         }
        };
```

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!
2Re: Is this a reasonable data structure? by jeffa (Bishop) on Oct 24, 2003 at 15:15 UTC
DBD::CSV++

However, you are doing too much work:

```perl
...
# load address book entries
my $sth = $dbh->prepare('SELECT * FROM addressbook');
$sth->execute();

# fetchall_arrayref({}) returns every row as a hash reference
my %data = map {$_->{last} => $_} @{$sth->fetchall_arrayref({})};

print Dumper \%data;
```

jeffa

```
L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
```
Re: Re: Is this a reasonable data structure? by ndwg (Beadle) on Oct 24, 2003 at 18:38 UTC
Along the same lines, but a little easier if you don't know SQL, you can try using AnyData.

-Nathan
Re: Is this a reasonable data structure? by Art_XIV (Hermit) on Oct 23, 2003 at 20:56 UTC
What you are probably going to want is a hash of hashes, as long as you can count on the keys of your top-level hash being unique, and especially if those keys are something you will frequently use for lookups.

In a hash of hashes you'd load and access your data in a pattern similar to:

```perl
$person->{'caralo_l'}{'title'} = "Mrs";
$person->{'androno_j'}{'room'} = "317";
$person->{'paterman_s'}{'last'} = "Paterman";
$person->{'paterman_s'}{'email'} = "stv\@net.net";
```

The main downside to an array of hashes is that it (the array) will have to be traversed to find specific entries in the hashes. Of course, the top-level hash in a hash of hashes will also have to be traversed if its keys aren't what you are searching by.

Check out Randal Schwartz's 'Learning Perl Objects, References and Modules', if you haven't done so, for its very lucid discussions of references and data structures.
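The difference between the two lookups can be sketched as follows. The `key` field and the variable names are assumptions for this example; the rooms are taken from the thread's sample data.

```perl
use strict;
use warnings;

# Array of hashes: finding one person means scanning the array.
my @people = (
    { key => 'caralo_l',   last => 'Caralo',   room => 201 },
    { key => 'paterman_s', last => 'Paterman', room => 101 },
);
my ($found) = grep { $_->{key} eq 'paterman_s' } @people;

# Hash of hashes: the same lookup is a single hash access.
my %by_key = map { $_->{key} => $_ } @people;

print $found->{room}, "\n";                # 101
print $by_key{paterman_s}{room}, "\n";     # 101
```

Both structures here share the same inner hashes, so building the keyed view costs one pass over the array and no copying of the records.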
Re: Re: Is this a reasonable data structure? by eric256 (Parson) on Oct 23, 2003 at 21:09 UTC
If you are going for speed of retrieval, then you could do an LoL (list of lists) and then have a hash table to look up indexes. Then you could have hashes that are keyed on first name, last name, a combination, or anything else. This can be a rather fun problem. The sick reason I enjoyed the data structures class :-)

___________
Eric Hodges
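A rough sketch of that idea, assuming the indexed values are unique. The names `%col`, `%by_first` and `%by_last` are invented for this example; the rows are the sample data used elsewhere in the thread.

```perl
use strict;
use warnings;

# Rows stored once, as a list of lists (LoL).
my @rows = (
    [ 'Mrs',  'Linda', 'Caralo',   201, 148, 'she@borg.org'  ],
    [ 'Miss', 'Jean',  'Androno',  317, 167, 'j@alo.com'     ],
    [ 'Mr',   'Steve', 'Paterman', 101, 100, 'steve@net.net' ],
);
my %col = (title => 0, first => 1, last => 2, room => 3, phone => 4, email => 5);

# Hypothetical index hashes: each maps a field value to a row number.
my (%by_first, %by_last);
for my $i (0 .. $#rows) {
    $by_first{ $rows[$i][ $col{first} ] } = $i;
    $by_last{  $rows[$i][ $col{last}  ] } = $i;
}

print $rows[ $by_first{Jean} ][ $col{room} ], "\n";        # 317
print $rows[ $by_last{Paterman} ][ $col{email} ], "\n";    # steve@net.net
```

If a field can repeat (two people named Jean, say), each index value would have to hold a list of row numbers instead of a single one.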
Re: Is this a reasonable data structure? by Theo (Priest) on Oct 25, 2003 at 00:48 UTC
I was expecting a trickle of information, but find I'm overwhelmed by the deluge. I think I understand about 20% of what y'all have suggested. I'll be sitting down with the Llama and Camel as I read through your replies. There is so much to absorb here, understanding it will be a growth experience.

Thank You all.

-theo-
(so many nodes and so little time ... )
Note: All opinions are untested, unless otherwise noted
Re: Is this a reasonable data structure? by eric256 (Parson) on Oct 23, 2003 at 20:37 UTC
You could instead use an array of arrays.

```perl
use strict;
use warnings;
use Data::Dumper;

my $i = 0;
my $line = <DATA>;
chomp($line);

my $colums = {              # hash ref to hold column numbers
    map { $_ => $i++ }      # map each column name to its index
    split(/\|/, $line)      # split the header on the pipe and send it to map
};

print Dumper($colums);

my $rows;
foreach my $line (<DATA>) {
    chomp $line;
    push @$rows, [split(/\|/, $line)];
}

print "Record 1: first = " . $rows->[0][$colums->{first}];

__DATA__
title|first|last|room|phone|email
Mrs|Linda|Caralo|201|148|she@borg.org
Miss|Jean|Androno|317|167|j@alo.com
Mr|Steve|Paterman|101|100|steve@net.net
```

That way you don't reproduce the column information in every row. There is probably a better way to do this, but here was my whack at it.

___________
Eric Hodges
Re: Is this a reasonable data structure? by Anonymous Monk on Oct 24, 2003 at 12:25 UTC
"It seems to me kind of a waste to repeat the KEY information in each hash, but I haven’t thought up a better way." But that way it can go right into HTML::Template!	[reply]

