This is kind of an interesting problem. If I were being handed this task, my first two questions would be:
- Are the fields always in the same order? (in query strings, that's usually not guaranteed)
- Does each key appear in every line?
For a real-world task, I'd assume both answers are 'no,' so I'd have to be prepared to handle missing keys (including keys not appearing in the first line), and keys out of order. Given all that, I'd store the keys and values of each line in an array of hashes (an array to maintain the order of the lines). I'd also have a hash for keeping track of all the keys, and another hash for tracking which keys have the same value throughout so they can be dropped from the output. (There may be a clever way to make one hash do both those things, but it didn't occur to me.) This is what I ended up with. It drops the key/value pairs that are always the same (a & e), regardless of order or whether a key is sometimes missing.
One note: I tell the inner split to always produce 2 fields, because otherwise it'll produce undef where there's no value, which messes up the nifty map-to-hash.
#!/usr/bin/env perl
use 5.010; use strict; use warnings;
use Data::Printer;
my @l; # array of hashes, to hold the keys and values for each line
my %a; # hash to keep track of all keys
my %c; # hash to track whether a key changes value
# if a key is still in the hash when the loop finishes,
# that means it had the same value throughout and should
# be ignored
while(<DATA>){
chomp;
my %h = map { split /=/,$_,2 } split '&'; # split string into keys
+/values
push @l, \%h;
if( $. == 1 ){ # on the first line, load into %c
%c = %h;
} else { # on other lines, check values
for my $k (keys %h){
delete $c{$k} unless exists $c{$k} and $c{$k} eq $h{$k}; #
+ remove if different
$a{$k}=1; # keep track of key existing
}
}
}
# remove consistent keys from output
delete $a{$_} for keys %c;
for my $l (@l){
for my $k (sort keys %a){
printf "%-8s", $l->{$k} ? "$k=$l->{$k}" : ' ';
}
say '';
}
# I've mixed up the data a bit to reflect my open requirements.
# a & e should still be dropped from the output.
__DATA__
a=1&b=1&f=3&c=1&d=2&e=
a=1&b=2&c=3&e=&f=4
b=2&a=1&c=5&d=1&e=&f=5
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.