assigning arrays as values to keys of hash

pearllearner315 has asked for the wisdom of the Perl Monks concerning the following question:

Re: assigning arrays as values to keys of hash by davido (Cardinal) on Sep 18, 2018 at 05:20 UTC
In the canonical solution for finding unique elements a hash is employed since the keys are guaranteed to be unique. You simply need two hashes; one top-level (snake, bird), and one deeper level (scales, fangs, tail). But then once you've used that deeper level hash to remove the duplicate attributes, you can convert its keys to the contents of an anonymous array referred to by the top-level hash. In other words, you can replace the lower level hash with an array containing the keys that lower level hash once held.

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use Data::Dumper;

    my %hash;

    while (<DATA>) {
        chomp;
        my ($k, $v) = split /\s/;
        $hash{$k}{$v} = undef;
    }

    $hash{$_} = [keys %{$hash{$_}}] for keys %hash;

    print Dumper \%hash;

    __DATA__
    bird beak
    bird beak
    bird claw
    bird wings
    bird feathers
    snake fangs
    snake scales
    snake fangs
    snake tail

The output will be:

    $VAR1 = {
              'snake' => [
                           'tail',
                           'fangs',
                           'scales'
                         ],
              'bird' => [
                          'claw',
                          'wings',
                          'beak',
                          'feathers'
                        ]
            };

Another approach could be to track whether an attribute pair has been seen before in realtime during the while loop, rather than postprocessing the hash of hashes into a hash of arrays. To do this you could use a temporary `%attribseen` hash where the keys are some unique concatenation of the animal type and a given attribute of that animal. For example, 'bird' and 'beak' could be used to form a hash key of `bird|beak`, and then you use that to ensure uniqueness:

    my %hash;

    {
        my %attribseen;
        while (<DATA>) {
            chomp;
            my ($k, $v) = split /\s/;
            push @{$hash{$k}}, $v unless $attribseen{"$k|$v"}++;
        }
    }

    print Dumper \%hash;

    __DATA__
    bird beak
    bird beak
    bird claw
    bird wings
    bird feathers
    snake fangs
    snake scales
    snake fangs
    snake tail

The output will contain the same elements as before, though each array now keeps the order in which the attributes first appear in the input. For some this may be simpler to look at.

Even better (from a legibility standpoint) may be to separate out the uniqueness check into its own object, which can offer some internal state:

    #!/usr/bin/env perl

    package PairUnique;

    use strict;
    use warnings;

    sub new {return bless {}, shift}

    sub unique {
        my ($self, $k, $v) = @_;
        return !$self->{"$k=>$v"}++ ? $v : ();
    }

    package main;

    use strict;
    use warnings;
    use Data::Dumper;

    my %hash;

    {
        my $get = PairUnique->new;
        while (<DATA>) {
            chomp;
            my ($k, $v) = split /\s/;
            my $aref = $hash{$k} //= [];
            push @$aref, $get->unique($k,$v);
        }
    }

    print Dumper \%hash;

    __DATA__
    bird beak
    bird beak
    bird claw
    bird wings
    bird feathers
    snake fangs
    snake scales
    snake fangs
    snake tail

This separates out the uniqueness logic, and keeps only the structure-building logic inside the while loop. More code means more to maintain and understand, but it's possible that intent will be clearer to the person reading the code.

Dave
Re: assigning arrays as values to keys of hash by jwkrahn (Abbot) on Sep 18, 2018 at 04:28 UTC
For unique values you probably want to use a hash instead of an array, something like this:

    while ( <FILE> ) {
        my ( $key, $value ) = split;
        $hash{ $key }{ $value } = ();
    }
    print Dumper \%hash;
Re^2: assigning arrays as values to keys of hash by pearllearner315 (Acolyte) on Sep 18, 2018 at 04:36 UTC
I specifically need a hash of arrays. Is there any way that's possible?
Re^3: assigning arrays as values to keys of hash by kevbot (Vicar) on Sep 18, 2018 at 05:20 UTC
Why do you need a hash of arrays? Is this question related to homework? As mentioned by jwkrahn, using a multi-level hash would allow you to readily avoid duplicates.

You wrote: "I have the current code that works but it doesn't get rid of the duplicates"

The code you posted does not seem to work. When I ran your code, the hash keys contained an entire line of text and the values were undefined array references. When I removed the double quotes from the first argument to split, I was able to get a hash of array references. However, as you mentioned, there are duplicates in the array references.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $file = 'file.txt';
    open( FILE, '<', $file ) or die $!;

    my %hash;
    while ( <FILE> ) {
        chomp;
        my $lines = $_;
        my $key = (split(/ /, $lines))[0];
        my $value = (split(/ /, $lines))[1];
        push @{ $hash{$key} }, $value;
    }

    print Dumper(\%hash);

    exit;

There are quite a few ways you could go about removing the duplicates. Here is one way to do it with help from the uniq function of List::Util.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;
    use List::Util qw/uniq/;

    my $file = 'file.txt';
    open( FILE, '<', $file ) or die $!;

    my %hash;
    while ( <FILE> ) {
        chomp;
        my $lines = $_;
        my ($key, $value) = split(/ /, $lines);
        push @{ $hash{$key} }, $value;
    }

    foreach my $key( keys %hash ){
        my @array = @{$hash{$key}};
        my @uniq_elems = uniq @array;
        $hash{$key} = \@uniq_elems;
    }

    print Dumper(\%hash);

    exit;
Re^3: assigning arrays as values to keys of hash by AnomalousMonk (Archbishop) on Sep 18, 2018 at 06:01 UTC
Pushing each "organ" onto an autovivified anonymous array keyed by its "animal" preserves the original order of "organs" as found in the file (if this is of any importance). If preserving original order isn't important, use the simpler two-level hash approach described by others.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use autodie;
     no autodie qw(open close);
     ;;
     use List::MoreUtils qw(uniq);
     ;;
     use Data::Dump qw(dd);
     ;;
     my $file =
       qq{bird beak\n}     .
       qq{bird beak\n}     .
       qq{bird claw\n}     .
       qq{bird wings\n}    .
       qq{bird feathers\n} .
       qq{snake fangs\n}   .
       qq{snake scales\n}  .
       qq{snake fangs\n}   .
       qq{snake tail\n}
       ;
     print qq{[[$file]]};
     ;;
     open my $fh, '<', \$file or die qq{opening ram file: $!};
     ;;
     my %hash;
     while (my $line = <$fh>) {
       my $parsed = my ($animal, $organ) =
         $line =~ m{ \A ([[:alpha:]]+) \s+ ([[:alpha:]]+) \Z }xmsg;
       ;;
       die qq{bad line '$line'} unless $parsed;
       ;;
       push @{ $hash{$animal} }, $organ;
       }
     ;;
     close $fh or die qq{closing ram file: $!};
     ;;
     @$_ = uniq @$_ for values %hash;
     dd \%hash;
    "
    [[bird beak
    bird beak
    bird claw
    bird wings
    bird feathers
    snake fangs
    snake scales
    snake fangs
    snake tail
    ]]
    {
      bird  => ["beak", "claw", "wings", "feathers"],
      snake => ["fangs", "scales", "tail"],
    }

Give a man a fish: `<%-{-{-{-<`
Re^3: assigning arrays as values to keys of hash by jwkrahn (Abbot) on Sep 18, 2018 at 05:23 UTC
    my %hash;
    while ( <FILE> ) {
        my ( $key, $value ) = split;
        $hash{ $key }{ $value } = ();
    }
    # Replace each inner hash with an array of its (unique) keys.
    $_ = [ keys %$_ ] for values %hash;
    print Dumper \%hash;
Re: assigning arrays as values to keys of hash by tybalt89 (Monsignor) on Sep 18, 2018 at 20:06 UTC
Why do it in two passes when it can be done in one pass? Efficiency is over-rated. (tybalt89 ducks :)

    #!/usr/bin/perl

    # https://perlmonks.org/?node_id=1222551

    use strict;
    use warnings;
    use Data::Dumper;

    my %hash;

    while( <DATA> )
      {
      /(\S+)\s+(\S+)/ and
        $hash{$1} = [ keys %{{map {$_, 1} @{$hash{$1}}, $2}} ];
      }

    print Dumper \%hash;

    __DATA__
    snake fangs
    snake tail
    snake fangs
    bird feathers
    bird beak
    snake scales
    bird beak
    bird claw
    bird wings
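For anyone unpacking the one-pass expression, here is a minimal sketch of the same logic with each step named. The names `@lines`, `@so_far` and `%set` are illustrative and not from the original post, and the input is held in an array rather than read from `DATA`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sample input: the same pairs used in the post (held in an array here).
my @lines = (
    'snake fangs', 'snake tail', 'snake fangs',
    'bird feathers', 'bird beak', 'snake scales',
    'bird beak', 'bird claw', 'bird wings',
);

my %hash;
for my $line (@lines) {
    # Skip lines that do not hold two whitespace-separated fields.
    my ($animal, $part) = $line =~ /(\S+)\s+(\S+)/ or next;

    # Parts collected so far for this animal (empty the first time).
    my @so_far = @{ $hash{$animal} // [] };

    # Throwaway set: hash keys are unique by construction.
    my %set = map { $_ => 1 } @so_far, $part;

    # Store the de-duplicated keys back; their order is not preserved.
    $hash{$animal} = [ keys %set ];
}

# Sort only for stable display.
printf "%s: %s\n", $_, join ' ', sort @{ $hash{$_} } for sort keys %hash;
```

Because the array is rebuilt from hash keys on every line, the original order of the parts is lost, and the work per line grows with the number of parts already stored.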
Re^2: assigning arrays as values to keys of hash by Marshall (Canon) on Sep 18, 2018 at 22:29 UTC
Hi tybalt89! Great post! However, I am not convinced that your implementation would be more efficient than any of the "2 pass solutions". I thought my response to the OP at Re: assigning arrays as values to keys of hash was reasonable and, importantly, understandable by the OP. I think that sometimes PerlMonks fails new Perlers with overly complicated solutions that they can't understand or generalize. This OP is a beginner, not by user name, but by his original code. Your solution hides a foreach loop inside a map{} which does a lot of work. Shorter Perl code doesn't always mean "faster".
Re^3: assigning arrays as values to keys of hash by tybalt89 (Monsignor) on Sep 18, 2018 at 23:06 UTC
It seems my "Efficiency is over-rated" comment was unclear. I fully believe (without testing, therefore as an article of faith) that my solution is slower than the two pass solutions. I guess I didn't make that clear. I am less interested in efficiency and more interested in solutions that show more rarely used Perl capabilities. TIMTOWTDI forever :)
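The speed question could be settled by measurement rather than faith. Below is a rough sketch using the core Benchmark module. The input is synthetic (an assumption, not the OP's file), the sub names are made up for illustration, and `uniq` assumes List::Util 1.45 or newer. No timings are shown since they depend on the machine and the data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(uniq);    # uniq requires List::Util 1.45 or newer

# Synthetic input (an assumption): 20 animals with 50 parts each,
# and every animal/part pair listed twice.
my @pairs;
for my $i (1 .. 20) {
    for my $j (1 .. 50, 1 .. 50) {
        push @pairs, [ "animal$i", "part$j" ];
    }
}

# One pass: rebuild each animal's array through a temporary hash per line.
sub one_pass {
    my %hash;
    for my $pair (@pairs) {
        my ($k, $v) = @$pair;
        $hash{$k} = [ keys %{ +{ map { $_ => 1 } @{ $hash{$k} // [] }, $v } } ];
    }
    return \%hash;
}

# Two passes: push everything, then de-duplicate each array once.
sub two_pass {
    my %hash;
    push @{ $hash{ $_->[0] } }, $_->[1] for @pairs;
    @$_ = uniq @$_ for values %hash;
    return \%hash;
}

# One pass with a "seen" hash keyed on the animal/part pair.
sub seen_hash {
    my (%hash, %seen);
    for my $pair (@pairs) {
        my ($k, $v) = @$pair;
        push @{ $hash{$k} }, $v unless $seen{"$k|$v"}++;
    }
    return \%hash;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    one_pass  => \&one_pass,
    two_pass  => \&two_pass,
    seen_hash => \&seen_hash,
});
```

Results will vary with the number of distinct parts per key, since the map-based version redoes work proportional to the array length on every input line.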
Re^4: assigning arrays as values to keys of hash by Marshall (Canon) on Sep 18, 2018 at 23:50 UTC
Re^3: assigning arrays as values to keys of hash by Anonymous Monk on Sep 18, 2018 at 22:45 UTC
Don't post bad code!
Re^4: assigning arrays as values to keys of hash by tybalt89 (Monsignor) on Sep 18, 2018 at 23:36 UTC
Re^4: assigning arrays as values to keys of hash by Marshall (Canon) on Sep 18, 2018 at 23:33 UTC
Re: assigning arrays as values to keys of hash by BillKSmith (Monsignor) on Sep 18, 2018 at 14:33 UTC
It is possible to do exactly what you asked for by explicitly testing for duplicates (use the function 'none' from List::Util) before storing. Because it uses arrays, it does preserve the order of the features. This solution is probably the slowest running of all the suggestions you received.

    ?type pearllearner315.pl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use List::Util qw(none);
    use Data::Dumper;

    my $file = \<<'EOF';
    bird beak
    bird beak
    bird claw
    bird wings
    bird feathers
    snake fangs
    snake scales
    snake fangs
    snake tail
    EOF
    #my $file = 'file.txt';
    open( FILE, '<', $file ) or die $!;

    my %hash;
    while ( <FILE> ) {
        chomp;
        my $lines = $_;
        my ($key, $value) = split(/\s+/, $lines, 2);
        push @{ $hash{$key} }, $value
            if none {$value eq $_} @{$hash{$key}};
    }

    print Dumper(\%hash);

    ?perl pearllearner315.pl
    $VAR1 = {
              'snake' => [
                           'fangs',
                           'scales',
                           'tail'
                         ],
              'bird' => [
                          'beak',
                          'claw',
                          'wings',
                          'feathers'
                        ]
            };

Bill
Re: assigning arrays as values to keys of hash by Marshall (Canon) on Sep 18, 2018 at 19:00 UTC
I guess while we are beating this thing to death... I find this pretty easy to read... List::Util is a great module.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;
    use List::Util qw(uniq);

    my %hash;

    while (<DATA>)
    {
        my ($animal, $part) = split /\s+/;
        push @{$hash{$animal}},$part;
    }

    @{$hash{$_}} = uniq @{$hash{$_}} for keys %hash;

    print Dumper \%hash;

    __DATA__
    bird beak
    bird beak
    bird claw
    bird wings
    bird feathers
    snake fangs
    snake scales
    snake fangs
    snake tail
Re^2: assigning arrays as values to keys of hash by AnomalousMonk (Archbishop) on Sep 18, 2018 at 23:17 UTC
`@{$hash{$_}} = uniq @{$hash{$_}} for keys %hash;`

I think `@$_ = uniq @$_ for values %hash;` is more concise and even easier to read :)

Give a man a fish: `<%-{-{-{-<`
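A small sketch of why the shorter form works, assuming List::Util 1.45 or newer for `uniq`: each element returned by `values %hash` is the array reference stored in the hash, so assigning to `@$_` rewrites that array in place without any key lookup. The sample data is hard-coded for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(uniq);    # uniq requires List::Util 1.45 or newer

# Hash of arrays as it looks after the push loop, duplicates included.
my %hash = (
    bird  => [qw(beak beak claw wings feathers)],
    snake => [qw(fangs scales fangs tail)],
);

# $_ is each stored array reference in turn; @$_ = ... replaces the
# contents of that array, keeping the first occurrence of every value.
@$_ = uniq @$_ for values %hash;

printf "%s: %s\n", $_, join ' ', @{ $hash{$_} } for sort keys %hash;
# bird: beak claw wings feathers
# snake: fangs scales tail
```

The same aliasing is what lets `$_ = [ keys %$_ ] for values %hash;` replace an inner hash with an array: `values` hands back the actual hash slots, not copies.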

