Array of Hashes to Hash of arrays for SQL::Abstract

Skeeve has asked for the wisdom of the Perl Monks concerning the following question:

I just discovered SQL::Abstract because it's used by Mojo::PG

In SQL::Abstract you can write an insert command like this:

my $sql= new SQL::Abstract( array_datatypes => 1 );

my $required = {  a => [1, 3, 5],  b => [2, 4, 6],};

my($stmt, @bind)= $sql->insert( testtable => $required );
[download]

This will be created:

stmt:  "INSERT INTO testtable ( a, b) VALUES ( ?, ? )"
@bind: (
          [
            1,
            3,
            5
          ],
          [
            2,
            4,
            6
          ]
        );
[download]

I receive JSON data in my script and so my data is like this:

my $data = [
  {a => 1, b => 2},
  {a => 3, b => 4},
  {a => 5, b => 6},
];
[download]

I'm wondering whether or not there is something which would give me the required hash of arrays from my existing array of hashes.

My lengthy attempt is like this:

# %columns will become the hash of arrays
my %columns;

# Foreach row of data collect its keys and create an empty array in %c
+olumns
$columns{$_} //= [] foreach map { keys %$_ } @$data;

# Then, for each row
foreach my $row ( @$data ) {
  # push the value foreach column
  push @{$columns{$_}}, $row->{$_} foreach keys %columns;
}
[download]

Is there a better way to achieve this transposition?

s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

Comment on Array of Hashes to Hash of arrays for SQL::Abstract Select or Download Code

Replies are listed 'Best First'.
Re: Array of Hashes to Hash of arrays for SQL::Abstract by hippo (Bishop) on Mar 18, 2020 at 09:48 UTC
In most cases "better" is likely to be subjective. However, I would probably try to loop over $data just once: `use strict; use warnings; use Test::More tests => 1; my $data = [ {a => 1, b => 2}, {a => 3, b => 4}, {a => 5, b => 6} ]; my $want = { a => [1, 3, 5], b => [2, 4, 6] }; my %columns; for my $row (@$data) { for my $k (keys %$row) { $columns{$k} //= []; push @{$columns{$k}}, $row->{$k}; } } is_deeply \%columns, $want;` [download]	[reply] [d/l]
Re^2: Array of Hashes to Hash of arrays for SQL::Abstract by Skeeve (Parson) on Mar 18, 2020 at 10:39 UTC
However, I would probably try to loop over $data just once The disadvantage here is, that you might loose null-values due to missing keys in your data `my $data = [ {a => 1, b => 2, c => 11}, {a => 3, c => 12}, {a => 5, b => 6} ];` [download] `s$$([},&%#}/&/]+}%&{});#$&&s&&$^X.($'^"%]=\&(\|?{%` `+`.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e	[reply] [d/l] [select]
Re^3: Array of Hashes to Hash of arrays for SQL::Abstract by Veltro (Hermit) on Mar 18, 2020 at 16:27 UTC
I think you are right, in such circumstance I think I would use an index: Something like: `for my $i (0..$#{$data}) { foreach ( keys %{$data->[$i]} ) { $columns->{$_}->[$i] = $data->[$i]->{$_} ; } ; }` [download] Also makes it fairly easy to see what happens: `$columns->{$_}->[$i] = $data->[$i]->{$_} ; hash------^-------------------------^ array-----------^-------------^` [download]	[reply] [d/l] [select]
Re^3: Array of Hashes to Hash of arrays for SQL::Abstract by haukex (Archbishop) on Mar 19, 2020 at 09:52 UTC
The disadvantage here is, that you might loose null-values due to missing keys in your data ... which you didn't mention at all in the root node. Changing the specs on us is not nice :-P Is this example you've shown now representative, or are there other edge cases?	[reply]
Re^4: Array of Hashes to Hash of arrays for SQL::Abstract by Skeeve (Parson) on Mar 19, 2020 at 15:13 UTC
Re: Array of Hashes to Hash of arrays for SQL::Abstract by BillKSmith (Monsignor) on Mar 18, 2020 at 17:48 UTC
You have not specified exactly what is required for your 'missing keys' test case. There is a very subtle difference between `$data_hash->{c}->[2] = undef` and `$data)hash->{c}->[2] does not exist`. I assumed that it should be undef. This requires setting the length of the arrays. The following test passes. Try commenting out the statement with '# Set length'. Observe how the test fails. `use strict; use warnings; use Test::More tests=>1; my $data = [ {a => 1, b => 2, c=>11}, {a => 3, c=>12}, {a => 5, b => 6}, ]; my $required = { a => [1, 3, 5 ], b => [2, undef, 6 ], c =>[11, 12, undef], }; my $data_hash; foreach my $i (0..$#$data) { while ((my $key, my $value)= each %{$data->[$i]}) { $#{$data_hash->{$key}} = $#$data; # Set length $data_hash->{$key}->[$i] = $value; } } is_deeply($data_hash, $required, 'missing keys');` [download] Of course, if you require `c => [11, 12]` then the "set length" statement must be omitted. Bill	[reply] [d/l] [select]
Re: Array of Hashes to Hash of arrays for SQL::Abstract by Marshall (Canon) on Mar 18, 2020 at 22:07 UTC
Without any missing keys, the transformation is fairly straightforward. You do not need to create an empty array first. `use warnings; use strict; use Data::Dump qw(pp); my $data = [ {a => 1, b => 2}, {a => 3, b => 4}, {a => 5, b => 6}, ]; # create data struct like this # # my $required = { a => [1, 3, 5], b => [2, 4, 6],}; my %required; foreach my $href (@$data) { push @{$required{$_}}, $href->{$_} foreach keys %$href; } pp \%required; #{ a => [1, 3, 5], b => [2, 4, 6] }` [download] I don't see the requirement for missing keys in the OP? If that is true, then I would probably make 2 passes over $data, the first to discover the "universe of keys" so that appropriate null or undef values can be filled into the arrays as they are generated. There is also the question of what kind of DB you are using and what if any "default" values are specified during table creation? It could also be that SQL::Abstract is not the best way to do what you want? Update: Some code to implement the above. ### assume that some keys could be missing or others added my $data2 = [ {a => 1, }, {a => 1, b => 2}, {a => 3, b => 3, c=>7}, {a => 5, b => 6, c=>8}, {a => 5, c=>8, d=>9}, ]; my %key_universe_h; my $default =0; # determine all possible keys (number of columns) foreach my $href (@$data2) { $key_universe_h{$_}++ foreach keys %$href; } my @key_universe_a = sort (keys %key_universe_h); # proceed as in first example, but with default # for non-existant key values my %required2; foreach my $href (@$data2) { foreach my $column (@key_universe_a) { my $value = $default; $value = $href->{$column} if exists $href->{$column}; push @{$required2{$column}}, $value; } } pp $data2; pp \%required2; [download] Prints: `[ { a => 1 }, { a => 1, b => 2 }, { a => 3, b => 3, c => 7 }, { a => 5, b => 6, c => 8 }, { a => 5, c => 8, d => 9 }, ] { a => [1, 1, 3, 5, 5], b => [0, 2, 3, 6, 0], c => [0, 0, 7, 8, 8], d => [0, 0, 0, 0, 9], }` [download]	[reply] [d/l] [select]
Re: Array of Hashes to Hash of arrays for SQL::Abstract by kcott (Archbishop) on Mar 19, 2020 at 07:46 UTC
G'day Skeeve, "Is there a better way to achieve this transposition?" The following code generates a list of unique keys once and doesn't require initialisation of `%columns`. Note that the source data I've used contains existent and non-existent keys along with defined and undefined values; for future reference, please consider providing more representative data in your OP. `#!/usr/bin/env perl use strict; use warnings; use Data::Dump; my $data = [ { a => 1 }, { a => 3, b => 4 }, { }, { b => 8 }, { a => undef, c => 9 }, ]; my (%columns, %seen_key); my @uniq_keys = grep !$seen_key{$_}++, map keys(%$_), @$data; for my $row (@$data) { push @{$columns{$_}}, $row->{$_} for @uniq_keys; } dd \%columns;` [download] That outputs: `{ a => [1, 3, undef, undef, undef], b => [undef, 4, undef, 8, undef], c => [undef, undef, undef, undef, 9], }` [download] If you want something other than `undef` to represent your null values — a zero-length string for instance — you can change the `push` to this if you have Perl 5.10 or later: `push @{$columns{$_}}, $row->{$_} // '' for @uniq_keys;` [download] For older versions of Perl, you can use this: `push @{$columns{$_}}, defined($row->{$_}) ? $row->{$_} : '' for @uniq_ +keys;` [download] Both of those produce identical output: `{ a => [1, 3, "", "", ""], b => ["", 4, "", 8, ""], c => ["", "", "", "", 9], }` [download] As a side note, code like this: `my $sql= new SQL::Abstract( array_datatypes => 1 );` [download] is discouraged and it is recommended that it be avoided. See "perlobj: Indirect Object Syntax" for details, paying particular attention to the opening, emboldened text: "Outside of the file handle case, use of this syntax is discouraged as it can confuse the Perl interpreter. See below for more details." A better way to write that statement would be: `my $sql = SQL::Abstract::->new(array_datatypes => 1);` [download] — Ken	[reply] [d/l] [select]


Think about Loose Coupling
	PerlMonks