PerlMonks  

Merging multidimensional hashes from forks to parent hash

by Speed_Freak (Sexton)
on Jul 12, 2018 at 17:35 UTC ( [id://1218395]=perlquestion )

Speed_Freak has asked for the wisdom of the Perl Monks concerning the following question:

I've been piecing this together for a while, but I'm getting stuck again. Does anyone have an idea how to merge a hash with itself?

I'm probably phrasing that wrong, but I have used Parallel::ForkManager to fork an if loop to utilize 40 threads. That if loop then creates a nested hash structure that needs to be returned. I use nfreeze from Storable to serialize the data, then I return a reference to that frozen data to the parent. The parent then dereferences and thaws it.
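For reference, the freeze/thaw round trip described above can be sketched with the core Storable module alone. This is a minimal sketch, not the poster's actual code: %child_result and its contents are hypothetical, modeled on the Dumper output in this post, and no forking is involved.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical result built inside one child process.
my %child_result = (
    '511879' => {
        'file3232.ext' => { e => 0, k => '0.0031', i => '26.9', j => '33.0' },
    },
);

# Child side: serialize the nested hash into a single byte string.
my $frozen = nfreeze( \%child_result );

# Parent side: thaw() returns a reference to a fresh copy of the structure.
my $thawed = thaw($frozen);

print ref($thawed), "\n";                             # HASH
print $thawed->{'511879'}{'file3232.ext'}{k}, "\n";   # 0.0031
```

Note that thaw() hands back a hash reference, not a hash, so whatever merges it into the parent has to work with (or dereference) that reference.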

    #dumper of thawed data
    $VAR44123 = '511879';
    $VAR44124 = {
        'file3232.ext' => {
            'e' => 0,
            'k' => '0.0031',
            'i' => '26.9',
            'j' => '33.0'
        }
    };
    $VAR44125 = '739569';
    $VAR44126 = {
        'file3232.ext' => {
            'e' => 0,
            'k' => '0.1040',
            'i' => '26.7',
            'j' => '14.6'
        }
    };

Before the forking happens, I define the parent hash as %parent = ();
And I need to push each data structure into that parent hash, but I need them to merge.
The final data structure should keep the leftmost (outermost) keys as they are, add the additional files at the middle level, and carry the corresponding datasets from the right side (the innermost level) for each of those added files.

The problem I'm having is that I can't figure out how to use Hash::Merge to merge the parent hash with each returned hash from the forks.
I was thinking something like %parent = merge( \%parent, \%returned); but that currently gives me the error "Reference found where even-sized list expected". In one of my earlier iterations I didn't get errors, but that merge caused the script to drop to one thread and hang until it ran out of memory.
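That message is what Perl emits, under use warnings, when a single reference is assigned to a hash. Hash::Merge's merge returns a hash reference, so the result needs dereferencing before it is assigned to %parent. A minimal sketch of the difference, using a hypothetical stand-in function (returns_hashref) rather than Hash::Merge itself:

```perl
use strict;
use warnings;

# Hypothetical stand-in for any function that returns a hash reference.
sub returns_hashref { return { a => 1, b => 2 } }

my @warnings;
{
    # Capture warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my %wrong = returns_hashref();    # one reference assigned to a hash: warns
}

my %right = %{ returns_hashref() };   # dereference first, then assign

print $warnings[0];
print join( ',', map {"$_=$right{$_}"} sort keys %right ), "\n";   # a=1,b=2
```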

Am I barking up the right tree here? Or should I be using something else to merge a structure like this?

cliff notes: I need to start with an empty hash, and then return multidimensional hashes from forks and compile them into the parent hash as each fork returns.
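One way to picture that goal without any CPAN module is a small recursive merge that extends the parent hash in place. This is only a sketch under assumptions: merge_into is a hypothetical helper, the two "returned" structures are made-up samples shaped like the Dumper output above (file4000.ext is invented for illustration), and on a conflicting leaf the newer value simply overwrites the older one.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every leaf of $src into $dst, descending into
# nested hashes so existing branches are extended instead of replaced.
sub merge_into {
    my ( $dst, $src ) = @_;
    for my $key ( keys %$src ) {
        if ( ref $src->{$key} eq 'HASH' ) {
            $dst->{$key} = {} unless ref $dst->{$key} eq 'HASH';
            merge_into( $dst->{$key}, $src->{$key} );
        }
        else {
            $dst->{$key} = $src->{$key};    # on conflict the newer value wins
        }
    }
    return $dst;
}

my %parent = ();

# Two hypothetical results, as they might arrive from two forks.
my @returned = (
    { '511879' => { 'file3232.ext' => { e => 0, k => '0.0031' } } },
    { '511879' => { 'file4000.ext' => { e => 1, k => '0.1040' } } },
);

merge_into( \%parent, $_ ) for @returned;

print join( ',', sort keys %{ $parent{'511879'} } ), "\n";   # file3232.ext,file4000.ext
```

Because the helper modifies %parent through a reference, there is no hash-versus-reference assignment to get wrong, and the parent is not rebuilt for every returned fork.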

Replies are listed 'Best First'.
Re: Merging multidimensional hashes from forks to parent hash
by choroba (Cardinal) on Jul 12, 2018 at 18:40 UTC
    The problem I'm having is I don't understand what output you expect for the given input sample.

    Do you want to add the numbers with the same keys?

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper;

    my %parent = (
        'file3232.ext' => {
            e => 0,
            k => '0.0031',
            i => '26.9',
            j => '33.0',
        }
    );
    my %returned = (
        'file3232.ext' => {
            e => 0,
            k => '0.1040',
            i => '26.7',
            j => '14.6',
        }
    );

    for my $file (keys %returned) {
        for my $dataset (keys %{ $returned{$file} }) {
            $parent{$file}{$dataset} += $returned{$file}{$dataset};
        }
    }
    print Dumper \%parent;

    With Hash::Merge, the approach is as follows: retrieve the most similar predefined behaviour, modify it to your needs (e.g. define the addition of scalar values), and use it to merge the hashes:

    use Hash::Merge qw{ merge };

    my $behavior = Hash::Merge::get_behavior_spec('STORAGE_PRECEDENT');
    $behavior->{SCALAR}{SCALAR} = sub { $_[0] + $_[1] };
    Hash::Merge::add_behavior_spec($behavior);

    %parent = %{ merge(\%parent, \%returned) };
    print Dumper \%parent;

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: Merging multidimensional hashes from forks to parent hash
by atcroft (Abbot) on Jul 12, 2018 at 20:01 UTC

    I believe I understood what you were trying to do, but as I had not looked at Hash::Merge, I guess I went about things the long way. The sample code below ran a maximum of 65 children, with at most 5 running simultaneously. To simulate a workload, each child generates a "filename", picks 4 keys from a set of 8 possible, and generates a value for each key. When the data is returned to the parent, it is merged into a single HoH (level 1: filename; level 2: key), and a "total" entry summing the values for each key is also generated. The data is then dumped at the end of the script. (Sample data run follows.)

    #!/usr/bin/env perl
    use strict;
    use warnings;

    use Data::Dumper;
    use Getopt::Long;
    use Parallel::ForkManager;

    $Data::Dumper::Deepcopy = 1;
    $Data::Dumper::Sortkeys = 1;
    $| = 1;

    my $max_child   = 5;
    my %parent_hash = ();

    my $pm = Parallel::ForkManager->new($max_child);

    # Runs in the parent each time a child exits; $ds_ref is the child's data.
    $pm->run_on_finish(
        sub {
            my ( $pid, $exit_code, $process_ident, $exit_signal, $core_dump, $ds_ref, ) = @_;
            if ( defined $ds_ref ) {
                foreach my $k1 ( sort { $a cmp $b } keys %{$ds_ref} ) {
                    foreach my $k2 ( keys %{ $ds_ref->{$k1} } ) {
                        $parent_hash{$k1}{$k2}   += $ds_ref->{$k1}{$k2};
                        $parent_hash{total}{$k2} += $ds_ref->{$k1}{$k2};
                    }
                }
            }
        },
    );

    foreach ( my $i = 0 ; $i < 128 ; $i += 8 ) {
        $pm->start and next;
        sleep $i;
        srand();    # So children have different random seeds
        my $child_hash;
        foreach my $j ( 0 .. int( rand() * 5 + 1 ) ) {
            my $fn = sprintf qq{file%05d.dat}, $i + $j;
            my @letters = ( 'a' .. 'h', );
            foreach my $k ( 0 .. 3 ) {
                # Draw until we hit a letter this file does not have yet
                # (index 0 is a valid draw, so it must not end the loop).
                my $l;
                do { $l = int( rand() * scalar @letters ) }
                    while ( exists $child_hash->{$fn}{ $letters[$l] } );
                my $m = int( rand() * 20 );
                $child_hash->{$fn}{ $letters[$l] } = $m;
            }
        }
        print Data::Dumper->Dump( [ \$i, \$child_hash, ], [qw( *i *child_hash )] ), qq{\n};
        $pm->finish( 0, $child_hash, );
    }
    $pm->wait_all_children;

    print Data::Dumper->Dump( [ \%parent_hash, ], [qw( *parent_hash )] ), qq{\n};

    Test run output:

    Hope that helps.

