Massive expansion of a hash of arrays?

by Amblikai (Scribe)
on Jul 17, 2014 at 20:42 UTC ( [id://1094124]=perlquestion )

Amblikai has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks! I've got a question which I'm hoping could help make one of my scripts a bit more concise.

Essentially I have a hash of arrays (massively simplified):

my %hash = (
    'ID' => {
        'key1' => [ 'key1_val1', 'key1_val2' ],
        'key2' => [ 'key2_val1', 'key2_val2' ],
    },
);

And I need to expand it out to look like this:

ID, 1, key1_val1, key2_val1
ID, 2, key1_val1, key2_val2
ID, 3, key1_val2, key2_val1
ID, 4, key1_val2, key2_val2

So each key is a field of data with various values, and I need to expand out a single line for each combination given. As I said, this is massively simplified: I actually have ~15 'keys' and each can have about 11-12 values in its array.

I'm currently doing it the obvious way of:

my $id2  = 0;    # row counter; incremented before use, so rows start at 1
my %data = ();
foreach my $id ( keys %hash ) {
    foreach my $key1_val ( @{ $hash{$id}{'key1'} } ) {
        foreach my $key2_val ( @{ $hash{$id}{'key2'} } ) {
            $id2++;
            $data{$id}{$id2}{'key1'} = $key1_val;
            $data{$id}{$id2}{'key2'} = $key2_val;
        }
    }
}

Which is fine, but every time I add a new field it gets a bit more unwieldy. I have "foreach" statements running off the side of the monitor and onto my wall!! Help!

Apologies if there are any mistakes in the above. It's been a long day. Any help appreciated as ever!

Replies are listed 'Best First'.
Re: Massive expansion of a hash of arrays?
by Anonymous Monk on Jul 17, 2014 at 22:53 UTC
    If I understand you correctly, you want to produce all possible combinations of elements from your arrays. Maybe it would be easier to generate indexes for arrays separately. It's like an odometer:
    0 0 0
    1 0 0
    2 0 0
    3 0 0
    0 1 0
    0 2 0
    0 3 0
    0 0 1
    1 0 1
    2 0 1
    3 0 1
    0 1 1
    0 2 1
    (etc)
    So, your odometer looks like this (at first):
    my @odometer = (0, 0, 0);
    And you use it like this:
    while_odometer_has_not_run_to_completion... {
        $id2++;
        (
            $data{$id}{$id2}{'key1'},
            $data{$id}{$id2}{'key2'},
            $data{$id}{$id2}{'key3'},
        ) = (
            $hash{$id}{'key1'}[ $odometer[0] ],
            $hash{$id}{'key2'}[ $odometer[1] ],
            $hash{$id}{'key3'}[ $odometer[2] ],
        );
        next_odometer( \@odometer );
    }
    ...or something like that...
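The odometer idea above could be fleshed out roughly as follows. This is only a sketch under a few assumptions: next_odometer is given the array sizes as a second argument (the pseudocode passes only the odometer), the helper name expand is made up for illustration, the fields are taken in sorted key order, and the rightmost wheel turns fastest so that the rows come out in the order asked for in the question.

```perl
use strict;
use warnings;

my %hash = (
    ID => {
        key1 => [ 'key1_val1', 'key1_val2' ],
        key2 => [ 'key2_val1', 'key2_val2' ],
    },
);

# Advance the odometer in place; the rightmost wheel turns fastest.
# $limits->[$i] is the number of values available to wheel $i.
# Returns false once every wheel has rolled over (all combinations seen).
sub next_odometer {
    my ( $odometer, $limits ) = @_;
    for my $wheel ( reverse 0 .. $#$odometer ) {
        return 1 if ++$odometer->[$wheel] < $limits->[$wheel];
        $odometer->[$wheel] = 0;    # roll over and carry to the next wheel
    }
    return 0;
}

# Hypothetical helper: expand one ID's hash of arrays into a list of rows.
sub expand {
    my ($fields) = @_;
    my @keys   = sort keys %$fields;                        # fixed field order
    my @limits = map { scalar @{ $fields->{$_} } } @keys;
    return if !@keys or grep { $_ == 0 } @limits;           # nothing to combine
    my @odometer = (0) x @keys;
    my @rows;
    do {
        push @rows,
          [ map { $fields->{ $keys[$_] }[ $odometer[$_] ] } 0 .. $#keys ];
    } while ( next_odometer( \@odometer, \@limits ) );
    return @rows;
}

for my $id ( sort keys %hash ) {
    my $n = 0;
    print join( ', ', $id, ++$n, @$_ ), "\n" for expand( $hash{$id} );
}
```

Adding a sixteenth field then means adding a key to the hash, not another nested loop. Note that expand builds every row in memory, which is only sensible for small data sets.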
Re: Massive expansion of a hash of arrays?
by sundialsvc4 (Abbot) on Jul 17, 2014 at 22:53 UTC

    If the data structure being traversed is deep and/or arbitrary, perhaps a tool such as Data::Walker will come in handy. (There are many tools of this sort on CPAN.)

    These tools, as the name implies, allow you to “walk through” a data structure in a predictable way, regardless of its depth. Might or might not be apropos in this particular situation, but worth knowing about anyhow.

      Thanks, I'm reading up on it now. Meanwhile, I'd like to understand recursion better, as this is the first time I've come across it and it looks immensely useful for the future.

      Conceptually it seems simple enough, a function calling itself, but I can't get my head around the details of what the code is doing!

        I recommend you start with the Wikipedia article titled Recursion (computer science). And because "recursion is one of the central ideas of computer science," you're going to find explanations and examples of it in any good computer science textbook.

        If you have an arbitrarily nested data structure with both arrays and hashes in it, then you're going to need to use the ref function. You'll use an if-then-else construct within your recursive function to decide what to do at each level in the nested data structure.
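As a rough illustration of that advice, here is a minimal sketch of a recursive walker that uses ref to decide what to do at each level. The sub name walk, the callback interface and the '/'-joined path format are all made up for this example; anything that is not a hash or array reference is treated as a leaf.

```perl
use strict;
use warnings;

# Recursively visit a nested structure, calling $visit->($path, $value)
# for everything that is not a hash or array reference.
sub walk {
    my ( $data, $visit, @path ) = @_;
    my $type = ref $data;
    if ( $type eq 'HASH' ) {
        walk( $data->{$_}, $visit, @path, $_ ) for sort keys %$data;
    }
    elsif ( $type eq 'ARRAY' ) {
        walk( $data->[$_], $visit, @path, "[$_]" ) for 0 .. $#$data;
    }
    else {
        $visit->( join( '/', @path ), $data );    # base case: a leaf value
    }
    return;
}

my %hash = (
    ID => {
        key1 => [ 'key1_val1', 'key1_val2' ],
        key2 => [ 'key2_val1', 'key2_val2' ],
    },
);

walk( \%hash, sub { print "$_[0] = $_[1]\n" } );
```

The else branch is the base case that stops the recursion; the other two branches each make the problem one level smaller before calling walk again.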

Re: Massive expansion of a hash of arrays?
by djerius (Beadle) on Jul 18, 2014 at 21:10 UTC
    Update: Whoops! Looks like Cristoforo suggested this first. Not sure why that post's code is all struck out though.

    Set::CrossProduct to the rescue:

    use strict;
    use warnings;
    use Set::CrossProduct;

    my %hash = (
        'ID' => {
            'key1' => [ qw/ key1_val1 key1_val2 / ],
            'key2' => [ qw/ key2_val1 key2_val2 / ],
        },
    );

    foreach my $id ( keys %hash ) {
        my $keys = $hash{$id};
        my $set  = Set::CrossProduct->new( [ values %$keys ] );
        print join( ', ', $id, @$_ ), "\n" while $_ = $set->get;
    }
    Results in
    ID, key2_val1, key1_val1
    ID, key2_val1, key1_val2
    ID, key2_val2, key1_val1
    ID, key2_val2, key1_val2
    Not exactly the order of your requested output, but that's because your keys are in a hash.
Re: Massive expansion of a hash of arrays?
by Cristoforo (Curate) on Jul 17, 2014 at 23:36 UTC
    Set::CrossProduct provides a solution.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Set::CrossProduct;

    my %hash = (
        'ID' => {
            'key1' => [ 'key1_val1', 'key1_val2' ],
            'key2' => [ 'key2_val1', 'key2_val2' ],
        },
    );

    for my $id ( keys %hash ) {
        my @data = values %{ $hash{$id} };
        my $cp   = Set::CrossProduct->new( \@data );
        my $i    = 1;
        while ( my $array_ref = $cp->get ) {
            print join( " ", $id, $i++, @$array_ref ), "\n";
        }
    }
    This prints
    ID 1 key2_val1 key1_val1
    ID 2 key2_val1 key1_val2
    ID 3 key2_val2 key1_val1
    ID 4 key2_val2 key1_val2
    Note that my @data = values %{ $hash{$id} }; does not return the values in 'key1', 'key2' ... 'key15' order, because the order of a hash's keys and values is not predictable. To get that order, @data would have to be built from an explicitly sorted list of keys.
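One possible way to pin down the field order is to sort the keys yourself and take a hash slice. The sketch below assumes every key ends in a number, as in the simplified example (the twelve-key %fields hash is invented just to show why a plain string sort is not enough), and it uses only core Perl.

```perl
use strict;
use warnings;

# Made-up sample data: key1 .. key12, two values each.
my %fields = map { ( "key$_" => [ "key${_}_val1", "key${_}_val2" ] ) } 1 .. 12;

# Sort by the numeric suffix so that key2 comes before key10.
# A plain "sort keys" would give key1, key10, key11, key12, key2, ...
my @keys = sort { ( $a =~ /(\d+)$/ )[0] <=> ( $b =~ /(\d+)$/ )[0] } keys %fields;

# A hash slice returns the array refs in exactly that key order.
my @data = @fields{@keys};

print "$keys[$_]: @{ $data[$_] }\n" for 0 .. $#keys;
```

The resulting \@data can be handed to Set::CrossProduct->new in place of the values-based list, and each combination then has its columns in key1, key2, ... order.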

    Hope this helps,

    Chris

    Update: You will get a huge number of combinations for 12 arrays with 15 items in each: 15 ^ 12 = 129,746,337,890,625
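Before expanding anything, it may be worth checking how many rows to expect, since the count is just the product of the array sizes. A small core-Perl sketch (the sub name combination_count and the sample data are made up here):

```perl
use strict;
use warnings;

# Number of combinations = product of the sizes of all the arrays.
sub combination_count {
    my ($fields) = @_;
    my $count = 1;
    $count *= scalar @$_ for values %$fields;
    return $count;
}

my %small = ( key1 => [ 1, 2 ], key2 => [ 1, 2 ], key3 => [ 1 .. 12 ] );
print combination_count( \%small ), "\n";    # 2 * 2 * 12 = 48
```

If the number that comes back is in the millions or more, it is probably better to process each combination as it is generated than to store them all in a hash.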

      Yes, I think I'm overestimating my data set. It's more like 15 keys, but each key will have 2-3 values. Only one of them has 12 values.

      How would Set::CrossProduct deal with arrays that you didn't want to expand? One of the end values might be an array. I would want to copy that as is.

        Something like this would prevent one of the arrays from being processed in the cross product.
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Set::CrossProduct;

        my %hash = (
            'ID' => {
                'key1' => [ 'key1_val1', 'key1_val2' ],
                'key2' => [ 'key2_val1', 'key2_val2' ],
                'key3' => [ 'one', 'two' ],
            },
        );

        for my $id ( keys %hash ) {
            # Take key3 out of the hash so it is not part of the cross product.
            my $key3 = delete $hash{$id}{key3};
            my @data = values %{ $hash{$id} };
            my $cp   = Set::CrossProduct->new( \@data );
            my $i    = 1;
            while ( my $array_ref = $cp->get ) {
                print join( " ", $id, $i++, @$array_ref ), "\n";
            }
            print "@$key3\n";
        }
        Prints
        ID 1 key2_val1 key1_val1
        ID 2 key2_val1 key1_val2
        ID 3 key2_val2 key1_val1
        ID 4 key2_val2 key1_val2
        one two
Re: Massive expansion of a hash of arrays?
by Anonymous Monk on Jul 17, 2014 at 20:51 UTC

    Too many nested loops? Sounds like you need recursion!

      OK, I've come across this example of recursion from another user on this forum, BrowserUk, but I'm having trouble understanding it. Could anyone help?

      Re^3: Variable number of foreach loops

      The code is below:

      #! perl -slw
      use strict;

      sub nForX(&@) {
          my $code = shift;
          my $n    = shift;
          return $code->( @_ ) unless $n;
          for my $i ( @{ shift() } ) {
              &nForX( $code, $n-1, @_, $i );
          }
      }

      my @a = 1..10;
      my @b = 'a'..'z';
      my @c = map chr, 33 .. 47;

      nForX {
          print join ' ', @_;
      } 3, \( @a, @b, @c );

      Anyone? Thanks.

        OK, I’ll have a go at explaining how this works. But note first that BrowserUk produced his elegant solution by exploiting some of the less-well-known features of Perl syntax. Let’s get them out of the way first:
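As a small sketch of the syntax involved, assuming the features meant are the (&@) prototype, the call with a leading &, the parentheses in @{ shift() }, and the backslash applied to a parenthesised list (the subs apply and first_of are invented for this illustration):

```perl
use strict;
use warnings;

# (&@) prototype: the first argument may be written as a bare block,
# with no "sub" keyword and no comma after it.
sub apply(&@) {
    my $code = shift;
    return map { $code->($_) } @_;
}
my @doubled = apply { $_[0] * 2 } 1, 2, 3;
print "@doubled\n";    # 2 4 6

# Calling with a leading & bypasses the prototype, so a code reference
# held in a scalar can be passed as the first argument.
my $square  = sub { $_[0]**2 };
my @squared = &apply( $square, 1, 2, 3 );
print "@squared\n";    # 1 4 9

# @{ shift() } needs the parentheses: @{ shift } would mean the array @shift.
sub first_of { my @list = @{ shift() }; return $list[0] }
print first_of( [ 'x', 'y' ] ), "\n";    # x

# \( @a, @b ) is shorthand for ( \@a, \@b ): one reference per array.
my @a    = ( 1, 2 );
my @b    = ( 'x', 'y' );
my @refs = \( @a, @b );
print scalar(@refs), " refs, first points at @{ $refs[0] }\n";
```

In nForX the same pieces appear together: the block after the sub name becomes $code, each recursive call goes through &nForX because $code is by then an ordinary scalar, and \( @a, @b, @c ) supplies the three array references.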

        Now to the recursion. (1) On the first call, $code is initialised to the block { print join ' ', @_; }, and $n is set to 3. As 3 is non-zero, the call to return $code->(@_) is skipped. The for loop which follows is equivalent to this:

        for my $i (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) { &nForX($code, 2, \@b, \@c, $i); }

        The first iteration of this loop calls nForX for the second time, as follows:

        &nForX($code, 2, \@b, \@c, 1);

        (2) Within this second call, $code is set as before, and $n is 2. The for loop is now equivalent to this:

        for my $i ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z') { &nForX($code, 1, \@c, 1, $i); }

        On its first iteration, it calls nForX for the third time, as follows:

        &nForX($code, 1, \@c, 1, 'a');

        (3) Within this third call, $n is 1, and the loop is now this:

        for my $i ('!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/') { &nForX($code, 0, 1, 'a', $i); }

        On its first iteration, this loop calls nForX for the fourth time, as follows:

        &nForX($code, 0, 1, 'a', '!');

        (4) Within this fourth call, $n is now zero, so the sub ends with the statement return $code->(@_);, which is here equivalent to:

        return print join ' ', 1, 'a', '!';

        So at this point, the first line of output is printed, and a “true” value is returned to the caller, which was the third call to nForX. That third call throws the return value away, and proceeds to the next iteration of its for loop, which is equivalent to this:

        &nForX($code, 0, 1, 'a', '"');

        — and so on and so on, until all the calls to nForX have returned and all the loops are exhausted. (As am I, after all that!)
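To tie this back to the original question, the same recursive idea could be applied to the hash of arrays along these lines. This is a sketch, not BrowserUk's code: the sub name combine is made up, it takes the list of arrays as a single array reference with no depth counter, and the fields are assumed to be wanted in sorted key order.

```perl
use strict;
use warnings;

# Recursive cross product: pick one value from the first array, then
# recurse on the remaining arrays, carrying along the picks made so far.
sub combine {
    my ( $code, $arrays, @picked ) = @_;
    return $code->(@picked) unless @$arrays;    # nothing left to pick from
    my ( $first, @rest ) = @$arrays;
    combine( $code, \@rest, @picked, $_ ) for @$first;
    return;
}

my %hash = (
    ID => {
        key1 => [ 'key1_val1', 'key1_val2' ],
        key2 => [ 'key2_val1', 'key2_val2' ],
    },
);

for my $id ( sort keys %hash ) {
    my @keys = sort keys %{ $hash{$id} };
    my $n    = 0;
    combine(
        sub { print join( ', ', $id, ++$n, @_ ), "\n" },
        [ @{ $hash{$id} }{@keys} ],    # array refs in sorted key order
    );
}
```

Each level of recursion plays the part of one foreach loop, so the number of fields can grow without the code changing.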

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Any tips on where to start with that? I'm googling recursion but it looks massively complicated.

Re: Massive expansion of a hash of arrays?
by Sjakie (Initiate) on Jul 18, 2014 at 19:50 UTC
    Hi, below is an approach similar to the one you used yourself. The key difference is that it handles the different 'levels' in which the data is structured, i.e. 'ID' (hash), 'key1/key2' (hash of hashes), 'keyX_valY' (hash of hashes of arrays).
    my %hash = (
        'ID' => {
            'key1' => [ "key1_val1", "key1_val2" ],
            'key2' => [ "key2_val1", "key2_val2" ],
        },
    );

    foreach my $id ( keys %hash ) {
        foreach my $keyno ( sort keys %{ $hash{$id} } ) {
            for ( my $keyvalno = 0; $keyvalno <= $#{ $hash{$id}{$keyno} }; $keyvalno++ ) {
                print "$id - $keyno - $hash{$id}{$keyno}[$keyvalno]\n";
            }
        }
    }
