Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Array of Hashes to Hash of arrays for SQL::Abstract

by Skeeve (Parson)
on Mar 18, 2020 at 09:16 UTC ( [id://11114419]=perlquestion: print w/replies, xml ) Need Help??

Skeeve has asked for the wisdom of the Perl Monks concerning the following question:

I just discovered SQL::Abstract because it's used by Mojo::PG

In SQL::Abstract you can write an insert command like this:

my $sql= new SQL::Abstract( array_datatypes => 1 ); my $required = { a => [1, 3, 5], b => [2, 4, 6],}; my($stmt, @bind)= $sql->insert( testtable => $required );

This will be created:

stmt: "INSERT INTO testtable ( a, b) VALUES ( ?, ? )" @bind: ( [ 1, 3, 5 ], [ 2, 4, 6 ] );

I receive JSON data in my script and so my data is like this:

my $data = [ {a => 1, b => 2}, {a => 3, b => 4}, {a => 5, b => 6}, ];

I'm wondering whether or not there is something which would give me the required hash of arrays from my existing array of hashes.

My lengthy attempt is like this:

# %columns will become the hash of arrays my %columns; # Foreach row of data collect its keys and create an empty array in %c +olumns $columns{$_} //= [] foreach map { keys %$_ } @$data; # Then, for each row foreach my $row ( @$data ) { # push the value foreach column push @{$columns{$_}}, $row->{$_} foreach keys %columns; }

Is there a better way to achieve this transposition?


s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

Replies are listed 'Best First'.
Re: Array of Hashes to Hash of arrays for SQL::Abstract
by hippo (Bishop) on Mar 18, 2020 at 09:48 UTC

    In most cases "better" is likely to be subjective. However, I would probably try to loop over $data just once:

    use strict; use warnings; use Test::More tests => 1; my $data = [ {a => 1, b => 2}, {a => 3, b => 4}, {a => 5, b => 6} ]; my $want = { a => [1, 3, 5], b => [2, 4, 6] }; my %columns; for my $row (@$data) { for my $k (keys %$row) { $columns{$k} //= []; push @{$columns{$k}}, $row->{$k}; } } is_deeply \%columns, $want;
      However, I would probably try to loop over $data just once

      The disadvantage here is, that you might loose null-values due to missing keys in your data

      my $data = [ {a => 1, b => 2, c => 11}, {a => 3, c => 12}, {a => 5, b => 6} ];

      s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
      +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

        I think you are right, in such circumstance I think I would use an index:

        Something like:

        for my $i (0..$#{$data}) { foreach ( keys %{$data->[$i]} ) { $columns->{$_}->[$i] = $data->[$i]->{$_} ; } ; }

        Also makes it fairly easy to see what happens:

        $columns->{$_}->[$i] = $data->[$i]->{$_} ; hash------^-------------------------^ array-----------^-------------^
        The disadvantage here is, that you might loose null-values due to missing keys in your data

        ... which you didn't mention at all in the root node. Changing the specs on us is not nice :-P Is this example you've shown now representative, or are there other edge cases?

Re: Array of Hashes to Hash of arrays for SQL::Abstract
by BillKSmith (Monsignor) on Mar 18, 2020 at 17:48 UTC
    You have not specified exactly what is required for your 'missing keys' test case. There is a very subtle difference between $data_hash->{c}->[2] = undef and $data)hash->{c}->[2] does not exist. I assumed that it should be undef. This requires setting the length of the arrays. The following test passes. Try commenting out the statement with '# Set length'. Observe how the test fails.
    use strict; use warnings; use Test::More tests=>1; my $data = [ {a => 1, b => 2, c=>11}, {a => 3, c=>12}, {a => 5, b => 6}, ]; my $required = { a => [1, 3, 5 ], b => [2, undef, 6 ], c =>[11, 12, undef], }; my $data_hash; foreach my $i (0..$#$data) { while ((my $key, my $value)= each %{$data->[$i]}) { $#{$data_hash->{$key}} = $#$data; # Set length $data_hash->{$key}->[$i] = $value; } } is_deeply($data_hash, $required, 'missing keys');

    Of course, if you require c => [11, 12] then the "set length" statement must be omitted.

    Bill
Re: Array of Hashes to Hash of arrays for SQL::Abstract
by Marshall (Canon) on Mar 18, 2020 at 22:07 UTC
    Without any missing keys, the transformation is fairly straightforward. You do not need to create an empty array first.
    use warnings; use strict; use Data::Dump qw(pp); my $data = [ {a => 1, b => 2}, {a => 3, b => 4}, {a => 5, b => 6}, ]; # create data struct like this # # my $required = { a => [1, 3, 5], b => [2, 4, 6],}; my %required; foreach my $href (@$data) { push @{$required{$_}}, $href->{$_} foreach keys %$href; } pp \%required; #{ a => [1, 3, 5], b => [2, 4, 6] }
    I don't see the requirement for missing keys in the OP? If that is true, then I would probably make 2 passes over $data, the first to discover the "universe of keys" so that appropriate null or undef values can be filled into the arrays as they are generated. There is also the question of what kind of DB you are using and what if any "default" values are specified during table creation? It could also be that SQL::Abstract is not the best way to do what you want?

    Update: Some code to implement the above.

    ### assume that some keys could be missing or others added my $data2 = [ {a => 1, }, {a => 1, b => 2}, {a => 3, b => 3, c=>7}, {a => 5, b => 6, c=>8}, {a => 5, c=>8, d=>9}, ]; my %key_universe_h; my $default =0; # determine all possible keys (number of columns) foreach my $href (@$data2) { $key_universe_h{$_}++ foreach keys %$href; } my @key_universe_a = sort (keys %key_universe_h); # proceed as in first example, but with default # for non-existant key values my %required2; foreach my $href (@$data2) { foreach my $column (@key_universe_a) { my $value = $default; $value = $href->{$column} if exists $href->{$column}; push @{$required2{$column}}, $value; } } pp $data2; pp \%required2;
    Prints:
    [ { a => 1 }, { a => 1, b => 2 }, { a => 3, b => 3, c => 7 }, { a => 5, b => 6, c => 8 }, { a => 5, c => 8, d => 9 }, ] { a => [1, 1, 3, 5, 5], b => [0, 2, 3, 6, 0], c => [0, 0, 7, 8, 8], d => [0, 0, 0, 0, 9], }
Re: Array of Hashes to Hash of arrays for SQL::Abstract
by kcott (Archbishop) on Mar 19, 2020 at 07:46 UTC

    G'day Skeeve,

    "Is there a better way to achieve this transposition?"

    The following code generates a list of unique keys once and doesn't require initialisation of %columns. Note that the source data I've used contains existent and non-existent keys along with defined and undefined values; for future reference, please consider providing more representative data in your OP.

    #!/usr/bin/env perl use strict; use warnings; use Data::Dump; my $data = [ { a => 1 }, { a => 3, b => 4 }, { }, { b => 8 }, { a => undef, c => 9 }, ]; my (%columns, %seen_key); my @uniq_keys = grep !$seen_key{$_}++, map keys(%$_), @$data; for my $row (@$data) { push @{$columns{$_}}, $row->{$_} for @uniq_keys; } dd \%columns;

    That outputs:

    { a => [1, 3, undef, undef, undef], b => [undef, 4, undef, 8, undef], c => [undef, undef, undef, undef, 9], }

    If you want something other than undef to represent your null values — a zero-length string for instance — you can change the push to this if you have Perl 5.10 or later:

    push @{$columns{$_}}, $row->{$_} // '' for @uniq_keys;

    For older versions of Perl, you can use this:

    push @{$columns{$_}}, defined($row->{$_}) ? $row->{$_} : '' for @uniq_ +keys;

    Both of those produce identical output:

    { a => [1, 3, "", "", ""], b => ["", 4, "", 8, ""], c => ["", "", "", "", 9], }

    As a side note, code like this:

    my $sql= new SQL::Abstract( array_datatypes => 1 );

    is discouraged and it is recommended that it be avoided. See "perlobj: Indirect Object Syntax" for details, paying particular attention to the opening, emboldened text:

    "Outside of the file handle case, use of this syntax is discouraged as it can confuse the Perl interpreter. See below for more details."

    A better way to write that statement would be:

    my $sql = SQL::Abstract::->new(array_datatypes => 1);

    — Ken

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://11114419]
Approved by marto
Front-paged by haukex
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (4)
As of 2024-04-20 00:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found