http://qs321.pair.com?node_id=624248

hacker has asked for the wisdom of the Perl Monks concerning the following question:

Thanks to the patient mentoring of Anno, bart, and clinton, I'm making this post to embiggen my Perl skills.

I've got a project that I'm building to mirror an enormous amount of data in several dozen languages. Right now I have 3 separate arrays:

    my @langs    = ('en', 'de', 'fr', 'it', 'gr', '...');
    my @projects = ('dogs', 'cats', 'birds', 'horses');
    my @targets  = ('images', 'data', 'links', 'other');

In each of these languages, I build a structure that looks like:

    endogs, encats, enbirds, enhorses
    dedogs, decats, debirds, dehorses
    frdogs, frcats, frbirds, frhorses
    itdogs, itcats, itbirds, ithorses
    grdogs, grcats, grbirds, grhorses

Within each of those projects, I fetch the data for them:

    endogs/images.tar.bz2 endogs/data.tar.bz2 endogs/links.tar.bz2 endogs/other.tar.bz2
    dedogs/images.tar.bz2 dedogs/data.tar.bz2 dedogs/links.tar.bz2 dedogs/other.tar.bz2
    [...]
    encats/images.tar.bz2 encats/data.tar.bz2 encats/links.tar.bz2 encats/other.tar.bz2
    decats/images.tar.bz2 decats/data.tar.bz2 decats/links.tar.bz2 decats/other.tar.bz2

The way I'm doing this is very "array-based":

    foreach my $lang (@langs) {
        foreach my $project (@projects) {
            mkpath ("$project/$lang");
            foreach my $target (@targets) {
                my $backup = $backup_file;
                my $output = $output_save_file;
                print "Mirroring $project ($lang) now...\n";
                # Other stuff happens here
            }
        }
    }

I've always had trouble grokking hashes in Perl, many people know that... but it looks like I have to bite the bullet here and dive into an HoHoA to create this structure.

My question is... can I create a "completely anonymous hash", which can be built dynamically from the values in each of the arrays above? I'd actually like to add more granular detail here, without adding more and more arrays (further increasing that nesting)... something like:

'it' => 'Italian' { ... }, 'es' => 'Spanish' { ... }, ...

The goal is to be able to fetch each of the @targets within each of the @projects, for each of the @langs supported. If I want to back up another set of targets in another language, I should just be able to add another language to the list and have it inherit the rest of the members of the "anonymous hash" (if I'm using the vernacular correctly).

Is this possible? Is there another way to represent this structure without more levels of array nesting, and without duplicating manually-typed entries in a growing sub?

Replies are listed 'Best First'.
Re: Dating a Structure
by punch_card_don (Curate) on Jul 01, 2007 at 04:21 UTC
    Do you mean like this?
        foreach my $lang (@langs) {
            foreach my $project (@projects) {
                foreach my $target (@targets) {
                    my $dynamically_growing_hash{$lang}{$project}{$target} = something;
                    my $dynamically_growing_hash{$lang}{$project}{$target}{'sub_characteristic_1'} = something_else;
                }
            }
        }
      Oops, you can't lexicalize a hash element.
          syntax error at __FILE__ line __LINE__, near "$dynamically_growing_hash{"
          Execution of __FILE__ aborted due to compilation errors.
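
      A minimal sketch of one way around that error, assuming the intent was a single hash shared by every iteration: declare the hash once with my before the loops, then assign to its elements without my. Storing a hash ref at each leaf also avoids assigning a plain value to {$target} and then trying to use that same slot as a hash. The shortened lists, the file key, and the placeholder value are illustrative assumptions, not taken from the posts above.

```perl
use strict;
use warnings;

# Sample lists, shortened from the original question.
my @langs    = ('en', 'de');
my @projects = ('dogs', 'cats');
my @targets  = ('images', 'data');

# "my" declares whole variables, not individual elements,
# so the hash is declared once, outside the loops.
my %dynamically_growing_hash;

foreach my $lang (@langs) {
    foreach my $project (@projects) {
        foreach my $target (@targets) {
            # Intermediate levels are created on assignment (autovivification).
            # A hash ref at the leaf leaves room for more keys later.
            $dynamically_growing_hash{$lang}{$project}{$target} = {
                file                 => "$lang$project/$target.tar.bz2",
                sub_characteristic_1 => 'placeholder',    # hypothetical value
            };
        }
    }
}

print $dynamically_growing_hash{de}{cats}{data}{file}, "\n";    # decats/data.tar.bz2
```

      Supporting another language then only means adding its code to @langs; the loops fill in the rest.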

      Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: Dating a Structure
by toma (Vicar) on Jul 01, 2007 at 19:01 UTC
    I think a good reference for this is the book 'Intermediate Perl'. Here is some code without using any objects:
        use strict;
        use warnings;
        use Data::Dumper;

        my %data;
        my @projects = qw( dogs cats birds horses );
        my $langs    = { it => 'Italian', es => 'Spanish', en => 'English' };
        my @targets  = qw( images data links other );

        foreach my $project (@projects) {
            foreach my $lang_abbrev (keys %$langs) {
                foreach my $target (@targets) {
                    # Intermediate hash levels are created automatically on assignment
                    $data{$project}{$lang_abbrev}{$target} =
                        $lang_abbrev . $project . '/' . $target . '.tar.bz2';
                }
            }
        }
        print Dumper(\%data);
    This is a practical way to write code, but if you have a large project you end up with some very large subs and a very large file full of code. This becomes difficult to work on. One technique for breaking up the code a little bit is to use an object:
        use strict;
        use warnings;
        use Data::Dumper;

        my $data = BigData->new();
        print Dumper($data);

        package BigData;

        sub new {
            my $class = shift;
            my $data  = {};
            my @projects = qw( dogs cats birds horses );
            my $langs    = { it => 'Italian', es => 'Spanish', en => 'English' };
            my @targets  = qw( images data links other );

            foreach my $project (@projects) {
                foreach my $lang_abbrev (keys %$langs) {
                    foreach my $target (@targets) {
                        $data->{$project}{$lang_abbrev}{$target} =
                            $lang_abbrev . $project . '/' . $target . '.tar.bz2';
                    }
                }
            }
            bless $data, $class;
            return $data;
        }
        1;
    So far the object hasn't helped very much. Having one large object for your whole program doesn't do much good. Better to build the big object out of smaller objects:
        use strict;
        use warnings;
        use Data::Dumper;

        my $data = BigData->new();
        print Dumper($data);

        package BigData;

        sub new {
            my $class = shift;
            my $data  = {};
            my @projects = qw( dogs cats birds horses );

            # Each language is now a small object of its own
            my @langs;
            push @langs, Lang->new( { abbrev => 'it', name => 'Italian' } );
            push @langs, Lang->new( { abbrev => 'es', name => 'Spanish' } );
            push @langs, Lang->new( { abbrev => 'en', name => 'English' } );
            my @targets = qw( images data links other );

            foreach my $project (@projects) {
                foreach my $lang (@langs) {
                    foreach my $target (@targets) {
                        $data->{$project}{ $lang->get_abbrev() }{$target} =
                            $lang->get_abbrev() . $project . '/' . $target . '.tar.bz2';
                    }
                }
            }
            bless $data, $class;
            return $data;
        }
        1;

        package Lang;

        sub new {
            my $class = shift;
            my $lang  = {};
            my ($props) = @_;

            # Copy the supplied properties into the new object
            foreach my $key (keys %$props) {
                $lang->{$key} = $props->{$key};
            }
            bless $lang, $class;
            return $lang;
        }

        sub get_abbrev {
            my $lang = shift;
            return $lang->{abbrev};
        }
        1;
    This type of code also gets complicated. You would probably end up making more objects. Each object gets its own module in its own file. If you do it right, that makes it easier to maintain. The key is to keep your subroutines and files from getting too large.

    Here are some tips:

    1. Whenever you have something that looks like a large case statement, consider replacing it with a class (or a small set of classes).
    2. Translate the insides of deeply nested loops into subs.
    3. It isn't always obvious how to structure your code or your packages. Leave time for trying different approaches.
    4. Try using modules from CPAN instead of writing your own code. For example, Locale::Language might be useful.
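
    Tip 2 might look something like the sketch below. This is only one possible arrangement: archive_path and targets_for are hypothetical helper names, and the lists are shortened for illustration. The body of the innermost loop moves into small subs, so the loops that remain just say what gets built.

```perl
use strict;
use warnings;

my @projects = qw( dogs cats );
my %langs    = ( it => 'Italian', en => 'English' );
my @targets  = qw( images data links other );

# Hypothetical helper: what used to sit inside the innermost loop.
# Here it only builds the archive path for one target.
sub archive_path {
    my ($lang, $project, $target) = @_;
    return "$lang$project/$target.tar.bz2";
}

# Hypothetical helper: all targets for one project/language pair,
# returned as a reference to a hash of target => path.
sub targets_for {
    my ($lang, $project) = @_;
    my %paths;
    $paths{$_} = archive_path($lang, $project, $_) for @targets;
    return \%paths;
}

my %data;
foreach my $project (@projects) {
    foreach my $lang (sort keys %langs) {
        $data{$project}{$lang} = targets_for($lang, $project);
    }
}

print "$langs{it}: $data{dogs}{it}{links}\n";    # Italian: itdogs/links.tar.bz2
```

    Each sub stays short and can be tested on its own, which is the point of keeping subroutines from getting too large.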
    It should work perfectly the first time! - toma
Re: Dating a Structure
by FunkyMonk (Chancellor) on Jul 01, 2007 at 22:08 UTC
    I've always had trouble grokking hashes in Perl
    Think of them as arrays that can take any text as an index, not just a number.
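
    A small sketch of that comparison, with made-up sample data (the language names are illustrative only): the array is looked up by position, the hash by a text key.

```perl
use strict;
use warnings;

# An array is indexed by position...
my @langs = ('en', 'de', 'fr');
print "$langs[1]\n";         # de

# ...while a hash is indexed by any string (its "key").
my %lang_name = (en => 'English', de => 'German', fr => 'French');
print "$lang_name{de}\n";    # German

# Adding an entry is just an assignment to a new key.
$lang_name{it} = 'Italian';

# Unlike an array, a hash has no inherent order, so sort the keys
# whenever a stable order is wanted.
foreach my $code (sort keys %lang_name) {
    print "$code => $lang_name{$code}\n";
}
```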

    Have you read through the Perl Data Structures Cookbook?

re: Embiggen
by bibliophile (Prior) on Jul 01, 2007 at 16:11 UTC
    ++ to the OP for "embiggen" :-)