PerlMonks  

Making a hash of arrays using references

by Tmms (Initiate)
on Apr 08, 2017 at 22:03 UTC ( [id://1187495]=perlquestion )

Tmms has asked for the wisdom of the Perl Monks concerning the following question:

Hello

I am fairly new to Perl and I am struggling with references. I will post a fictitious example below of what I would like to do. Hopefully someone can help me.

Input: a text file with lines like this:
Obi Wan "tab" Jedi "newline"
Yoda "tab" Jedi "newline"
Count Dooku "tab" Sith "newline"
...

Desired output:
A hash where the affiliations (Jedi or Sith) are the keys and the values are arrays of names (Obi Wan, Yoda, ...)
I want to create the hash while "iterating" over the file line by line.

Problem:
I got stuck with the construction of the hash. First I checked if the key (affiliation) was defined. If not, I added the appropriate key-value pair ($hash{affiliation} => [name])
But I could not find a way to add a name to the anonymous array when the affiliation was already a key.

Many thanks in advance.

Replies are listed 'Best First'.
Re: Making a hash of arrays using references
by kcott (Archbishop) on Apr 09, 2017 at 07:07 UTC

    G'day Tmms,

    Welcome to the Monastery.

    "I am fairly new at perl and I am struggling with references."

    Take a look at "perlreftut - Perl references intro" for general information on references; it contains links to more detailed information - follow as needed.

    For general information on data structures, see "perldsc - Perl data structures intro"; again, this has links to more detailed documentation.

    The type of data structure you're attempting to create is called a Hash of Arrays (typically abbreviated to HoA). If you look at "Generation of a HASH OF ARRAYS" (in perldsc), the last line of the example code is pretty much what you're looking for.

    So, working from just your example data, your code would look something like this:

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Data::Dump;

    my %hash;

    while (<DATA>) {
        chomp;
        my ($value, $key) = split /\t/;
        push @{$hash{$key}}, $value;
    }

    dd \%hash;

    __DATA__
    Obi Wan	Jedi
    Yoda	Jedi
    Count Dooku	Sith

    Output:

    { Jedi => ["Obi Wan", "Yoda"], Sith => ["Count Dooku"] }
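Once the hash of arrays exists, reading it back uses the same dereferencing syntax. The following is only a minimal sketch: it writes the structure from the output above as a literal, and the variable names `@jedi`, `$jedi_count` and `$second` are made up for illustration.

```perl
use strict;
use warnings;

# Same structure as the output above, written as a literal for illustration.
my %hash = (
    Jedi => [ 'Obi Wan', 'Yoda' ],
    Sith => [ 'Count Dooku' ],
);

# Each value is an array reference; dereference it with @{ ... }.
my @jedi       = @{ $hash{Jedi} };          # copy of the whole list
my $jedi_count = scalar @{ $hash{Jedi} };   # number of names under one key
my $second     = $hash{Jedi}[1];            # single element, index 1

# Walk every key and print its names on one line.
for my $affiliation (sort keys %hash) {
    printf "%s (%d): %s\n",
        $affiliation,
        scalar @{ $hash{$affiliation} },
        join(', ', @{ $hash{$affiliation} });
}
```

The arrow between adjacent subscripts is optional, so `$hash{Jedi}[1]` and `$hash{Jedi}->[1]` refer to the same element.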

    While that works fine with your (ideal) example data, real world data throws up all sorts of nasty challenges. Whenever you're working with comma- (tab-, pipe-, whatever-) separated values, I recommend you use the Text::CSV module. If you also have Text::CSV_XS installed, it will run faster. This module has addressed these "nasty challenges": this is not a wheel you need to reinvent. In this instance, the while loop only needs one statement.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Text::CSV;
    use Data::Dump;

    my %hash;
    my $csv = Text::CSV::->new({sep_char => "\t"});

    while (my $row = $csv->getline(\*DATA)) {
        push @{$hash{$row->[1]}}, $row->[0];
    }

    dd \%hash;

    __DATA__
    Obi Wan	Jedi
    Yoda	Jedi
    Count Dooku	Sith

    Output (exactly the same as before):

    { Jedi => ["Obi Wan", "Yoda"], Sith => ["Count Dooku"] }

    See also: Data::Dump (which provides the dd function) or use Data::Dumper (which is a core module).
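If Data::Dump is not installed, the core Data::Dumper module can show the same structure. This is a rough sketch only; the two configuration variables are optional, and the literal hash stands in for one built from real input.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order, easier to compare runs
$Data::Dumper::Indent   = 1;   # tighter indentation than the default

# Stand-in for a hash built while reading a file.
my %hash = (
    Jedi => [ 'Obi Wan', 'Yoda' ],
    Sith => [ 'Count Dooku' ],
);

# Dumper returns a string; pass a reference so the hash is shown as one structure.
my $dump = Dumper(\%hash);
print $dump;
```

Unlike `dd`, `Dumper` does not print by itself, and its output spans several lines and starts with `$VAR1 =`.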

    — Ken

Re: Making a hash of arrays using references
by NetWallah (Canon) on Apr 09, 2017 at 06:23 UTC
    You will find this a frequent code pattern, young Padawan.

    Here is a self-contained solution that demonstrates the idioms.

    This code uses "Autovivification" to avoid the issue you experienced - i.e. the same code handles both the case where the key is absent, and when present. Read all about it in perlref.

    use strict;
    use warnings;

    my %team;
    while (defined (my $line = <DATA>)) {
        chomp $line;                                   # Zap the newline
        my ($name, $affiliation) = split /\t/, $line;
        next unless $affiliation;                      # Avoid empty lines
        push @{ $team{$affiliation} }, $name;
    }

    # Print the results
    for my $aff (sort keys %team) {
        print $aff, " :\t", join(", ", sort @{ $team{$aff} }), "\n";
    }
    __DATA__
    Obi Wan	Jedi
    Yoda	Jedi
    Count Dooku	Sith
    Darth Vader	Sith
    Luke Skywalker	Jedi

    OUTPUT:
    Jedi :	Luke Skywalker, Obi Wan, Yoda
    Sith :	Count Dooku, Darth Vader
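The autovivification mentioned above can be watched in isolation. The sketch below is illustrative only; `$before` and `$after` are hypothetical helper variables that record whether the key exists.

```perl
use strict;
use warnings;

my %team;

# Before the first push, the key does not exist at all.
my $before = exists $team{Jedi} ? 1 : 0;

# Dereferencing an undefined value as the target of push makes Perl
# create the anonymous array automatically.
push @{ $team{Jedi} }, 'Yoda';

# The very same statement now appends to the array that already exists.
push @{ $team{Jedi} }, 'Obi Wan';

my $after = exists $team{Jedi} ? 1 : 0;

print "before=$before after=$after names=@{ $team{Jedi} }\n";
```

No `exists` or `defined` test is needed in the loop itself; the checks here are only there to make the effect visible.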

            ...it is unhealthy to remain near things that are in the process of blowing up.     man page for WARP, by Larry Wall

Re: Making a hash of arrays using references (push)
by LanX (Saint) on Apr 08, 2017 at 22:10 UTC
    Welcome to the monastery Tmms,

    It looks like you are overwriting the hash entry with a new one-element array each time.

    Please try push instead:

      push @{$hash{$affiliation}}, $name;
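To make the difference concrete, here is a small side-by-side sketch. It assumes the original code assigned `[name]` on every line, which is a guess, and `@records`, `%overwritten` and `%appended` are hypothetical names used only for this comparison.

```perl
use strict;
use warnings;

# Stand-in for the parsed lines of the input file: [name, affiliation].
my @records = (
    [ 'Obi Wan',     'Jedi' ],
    [ 'Yoda',        'Jedi' ],
    [ 'Count Dooku', 'Sith' ],
);

my (%overwritten, %appended);
for my $record (@records) {
    my ($name, $affiliation) = @$record;

    # Assignment replaces whatever array reference was stored before.
    $overwritten{$affiliation} = [$name];

    # push appends to the existing array, creating it on first use.
    push @{ $appended{$affiliation} }, $name;
}

print "overwritten: ", join(', ', @{ $overwritten{Jedi} }), "\n";
print "appended:    ", join(', ', @{ $appended{Jedi} }), "\n";
```

With assignment only the last name per affiliation survives, while push keeps every name in input order.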

    Depending on the Perl version you'd be allowed to leave the @{...} out. °

    HTH :)

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

    °) Please forget that remark, because (strangely) "Starting with Perl 5.14, an experimental feature allowed push to take a scalar expression. This experiment has been deemed unsuccessful, and was removed as of Perl 5.24."
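On newer Perls there is a different, supported way to avoid the `@{...}` wrapper: postfix dereference, which became a stable feature in Perl 5.24. The sketch below assumes Perl 5.24 or later is available; it is an alternative spelling, not the removed experimental feature quoted above.

```perl
use strict;
use warnings;
use v5.24;   # postfix dereference is stable from this release on; also enables say

my %hash;

# ->@* dereferences the array reference and autovivifies it, like @{ ... }.
push $hash{Jedi}->@*, 'Obi Wan';
push $hash{Jedi}->@*, 'Yoda';

my $names = join ', ', $hash{Jedi}->@*;
say $names;
```

The circumfix form `push @{ $hash{Jedi} }, ...` behaves the same and also runs on older Perl versions.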

      Thank you for the replies. I will try the solutions and read the documents.
