Is there a "Here Table"?

rje has asked for the wisdom of the Perl Monks concerning the following question:

Re: Is there a "Here Table"? by BrowserUk (Patriarch) on Apr 07, 2015 at 21:07 UTC
It's hard to see the need for a module to do that, especially as the '----' line is very specific to your data format and would thus need a special case or option, and because it is very simple to do:

```perl
#! perl -slw
use strict;
use Data::Dump qw[ pp ];

my @keys = split ' ', scalar <DATA>;
<DATA>; ## discard -----

my @data = map{
    my %hash;
    @hash{ @keys } = split ' ';
    \%hash;
} <DATA>;

pp \@data;

__END__
Name     UPP    Age Career     Terms
-------- ------ --- ---------- -----
Rejnaldi 765987 38  Citizen    6
Lisandra 6779AA 34  Noble      4
Kuran    899786 42  Marine     8
```

Produces:

```
C:\test>junk
[
  { Age => 38, Career => "Citizen", Name => "Rejnaldi", Terms => 6, UPP => 765987 },
  { Age => 34, Career => "Noble", Name => "Lisandra", Terms => 4, UPP => "6779AA" },
  { Age => 42, Career => "Marine", Name => "Kuran", Terms => 8, UPP => 899786 },
]
```

Of course, someone will complain that it doesn't handle names with spaces, and so you need to switch to fixed field record processing:

```perl
#! perl -slw
use strict;
use Data::Dump qw[ pp ];

my @keys = unpack 'A8xA6xA3xA10xA5', scalar <DATA>;
<DATA>; ## discard

my @data = map{
    my %hash;
    @hash{ @keys } = unpack 'A8xA6xA3xA10xA5', $_;
    \%hash;
} <DATA>;

pp \@data;

__END__
Name     UPP    Age Career     Terms
-------- ------ --- ---------- -----
Rejnaldi 765987 38  Citizen    6
Lisandra 6779AA 34  Noble      4
Kuran    899786 42  Marine     8
```

Which produces the same output.

But ... they'll say: what if you want to read lots of different files in the same format? Then you need to determine the field sizes from the data:

```perl
#! perl -slw
use strict;
use Data::Dump qw[ pp ];

my @keys = scalar( <DATA> ) =~ m[(\S+\s*)\s]g;
my $templ = join 'x', map{ 'A' . length() } @keys;
@keys = map{ $_ =~ s[\s+$][]; $_ } @keys;
<DATA>; ## discard

my @data = map{
    my %hash;
    @hash{ @keys } = unpack $templ, $_;
    \%hash;
} <DATA>;

pp \@data;

__END__
```

Again, same output.

But what if the keys can contain spaces? In which case you'll need to use a heuristic approach to locating the field boundaries.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
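One possible heuristic for locating the field boundaries, offered only as a sketch and not as anything posted in this thread: treat every character column that is blank in all lines (header included) as a separator. The function name `parse_by_blank_columns` is hypothetical. It assumes space-padded columns (no tabs), and it will wrongly split a field if every value in it happens to have a space at the same position.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: infer field boundaries from the columns that are
# blank in every line, then unpack each line with the resulting template.
sub parse_by_blank_columns {
    my ($text) = @_;

    # Drop blank lines and any rule line made only of dashes and spaces.
    my @lines = grep { /[^\s-]/ } split /\r?\n/, $text;
    my $width = max map { length($_) } @lines;
    $_ .= ' ' x ( $width - length($_) ) for @lines;    # pad to common width

    # Start with an all-blank mask and mark every column that holds a
    # non-space character in at least one line.
    my $mask = ' ' x $width;
    for my $line (@lines) {
        for my $i ( grep { substr( $line, $_, 1 ) ne ' ' } 0 .. $width - 1 ) {
            substr( $mask, $i, 1 ) = '#';
        }
    }

    # Runs of '#' become A<n> fields; runs of blanks become x<n> skips.
    my $template = join ' ',
        map { ( /#/ ? 'A' : 'x' ) . length($_) } $mask =~ /(#+| +)/g;

    my ( $header, @rows ) = map { [ unpack $template, $_ ] } @lines;
    return [ map { my %h; @h{@$header} = @$_; \%h } @rows ];
}

# The key 'Full name' contains a space but is still one field, because
# the data rows have non-blank characters in that column.
my $demo_rows = parse_by_blank_columns(<<'EOT');
Full name UPP    Age
Rejnaldi  765987 38
Lisandra  6779AA 34
Kuran     899786 42
EOT
print "$_->{'Full name'}: $_->{UPP}\n" for @$demo_rows;
```

When the table does carry a '----' line, reading the boundaries from it is more reliable than guessing from the data.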
Re^2: Is there a "Here Table"? by jdporter (Paladin) on Apr 07, 2015 at 22:09 UTC
"what if the keys can contain spaces?"

Nah, just cache the first line and parse it after you've read the second line.

BTW - the data shown is (obviously?) in the format produced by default by SQL select statements, at least for some major RDBMS... So you'd think that this would be a "solved problem" by now...

I reckon we are the only monastery ever to have a dungeon stuffed with 16,000 zombies.
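A minimal sketch of the "cache the first line" idea, assuming the second line is always a rule of dashes separated by spaces. The names `template_from_rule` and `parse_ruled_table` are hypothetical. It reuses the `unpack` approach from the earlier reply, but takes the field widths from the rule line, so header keys may contain spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the rule line into an unpack template,
# e.g. '-------- ------ ---' becomes 'A8 x1 A6 x1 A3'.
sub template_from_rule {
    my ($rule) = @_;
    return join ' ',
        map { ( /-/ ? 'A' : 'x' ) . length($_) } $rule =~ /(-+|[^-]+)/g;
}

sub parse_ruled_table {
    my ($fh) = @_;
    my $header = <$fh>;    # cached: cannot be split reliably yet
    my $rule   = <$fh>;    # the dashes mark the field boundaries
    chomp( $header, $rule );

    my $template = template_from_rule($rule);
    my $width    = length $rule;

    # Pad each line to the rule's width so 'x' never runs off the end.
    my $fields = sub { unpack $template, sprintf( '%-*s', $width, $_[0] ) };

    my @keys = $fields->($header);
    my @rows;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines
        my %row;
        @row{@keys} = $fields->($line);
        push @rows, \%row;
    }
    return \@rows;
}

my $demo_table = <<'EOT';
Full name UPP    Age
--------- ------ ---
Rejnaldi  765987 38
Lisandra  6779AA 34
EOT
open my $demo_fh, '<', \$demo_table or die "Cannot open in-memory handle: $!";
my $demo_rows = parse_ruled_table($demo_fh);
print "$_->{'Full name'} is $_->{Age}\n" for @$demo_rows;
```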
Re^3: Is there a "Here Table"? by jdporter (Paladin) on Apr 08, 2015 at 02:57 UTC
My quick whack at it:

```perl
sub parse_HERE_table($) {
    my @lines = split /[\r\n]+/, $_[0];
    my $pattern = splice @lines, 1, 1;
    my $len = length $pattern;
    $pattern =~ y/-/A/;      # to use \b, we need \w chars, and '-' is not \w.
    $pattern =~ s/\bA/(A/g;  # too bad we can't do s/\</(/g and s/\>/)/g :-(
    $pattern =~ s/A\b/A)/g;
    $pattern =~ y/A/./;
    ( my $header, @lines ) =
        map { [ map { s/\s+$//; $_ } /$pattern/ ] }  # parse; trim trailing whitespace from each value.
        map { $_.(' 'x($len-length($_))) }           # pad with spaces to ensure it's long enough to match.
        grep { /\S/ }                                # skip blank lines.
        @lines;
    [ map { my %r; @r{ @$header } = @$_; \%r } @lines ]
}

my $arrayref = parse_HERE_table <<EOF;
Name     UPP    Age Career     Terms
-------- ------ --- ---------- -----
Rejnaldi 765987 38  Citizen    6
Lisandra 6779AA 34  Noble      4
Kuran    899786 42  Marine     8
EOF
```

This code assumes well-formed input. You could certainly add error checking and so on.
Re^3: Is there a "Here Table"? by BrowserUk (Patriarch) on Apr 07, 2015 at 22:31 UTC
"BTW - the data shown is (obviously?) in the format produced by default by sql select statements, at least for some major RDMS... So you'd think that this would be a "solved problem" by now..."

Well, yes. There is Parse::SQLOutput ... but it won't work for the OP's data.
Re: Is there a "Here Table"? by MidLifeXis (Monsignor) on Apr 07, 2015 at 20:58 UTC
Not sure if there is an automatic one, but perhaps the CSV DBI module could help in the translation. OTOH, fixed width columns are not that tough to translate.

--MidLifeXis
Re: Is there a "Here Table"? by erix (Prior) on Apr 07, 2015 at 21:07 UTC
If I remember correctly DBD::AnyData can read that format. I'll give it a try tomorrow if no one else has.

update: DBD::AnyData failed to install with either 5.21.11 or 5.20.2, so I'll put no more effort into this.
Re: Is there a "Here Table"? by LanX (Saint) on Apr 08, 2015 at 10:38 UTC
Let me take a more abstract approach to your question:

First, it's not clear if your question solely concentrates on DB table dumps.

Then a module capable of parsing human-readable tables wouldn't be restricted to here docs, but would be capable of parsing any string or filehandle.

Additionally the choice of resulting data format is not obvious; it depends on the use case. E.g. see Re: Building data structure from multi-row/column table.

There are plenty of small snippets in the monastery demonstrating how to parse such data.

I wouldn't know how to design a generic module which can be customized to handle all cases without requiring coding. (Taking into consideration that CSV is an edge case of a table format.)

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
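To make the two design points concrete (any string or filehandle as input, and a result shape chosen by the caller), here is a rough sketch. The function `read_table` and its `as` option are hypothetical and not an existing module's API. It assumes whitespace-free fields, treats any reference argument as a filehandle, and handles only this one table layout, which rather supports the point that a truly generic module is hard to design.

```perl
use strict;
use warnings;

# Hypothetical interface: $source is either a string holding the whole
# table or an open filehandle (any reference is assumed to be a handle).
# as => 'rows' (default) gives an array of hashes, one per row;
# as => 'columns' gives a hash of arrays, keyed by column name.
sub read_table {
    my ( $source, %opt ) = @_;
    my @lines = ref($source) ? <$source> : split /\n/, $source;

    my ( $header, @rows ) =
        map  { [ split ' ' ] }
        grep { /[^\s-]/ } @lines;    # drop blank lines and the rule line

    if ( ( $opt{as} // 'rows' ) eq 'columns' ) {
        my %col;
        for my $row (@rows) {
            push @{ $col{ $header->[$_] } }, $row->[$_] for 0 .. $#$header;
        }
        return \%col;
    }
    return [ map { my %h; @h{@$header} = @$_; \%h } @rows ];
}

my $demo = <<'EOT';
Name     UPP    Age
-------- ------ ---
Rejnaldi 765987 38
Lisandra 6779AA 34
EOT
my $by_row = read_table($demo);
my $by_col = read_table( $demo, as => 'columns' );
print "First row UPP: $by_row->[0]{UPP}\n";
print "All names: @{ $by_col->{Name} }\n";
```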

