http://qs321.pair.com?node_id=1130851


in reply to How could I simplify this redundant-column-removing code?

This is kind of an interesting problem. If I were being handed this task, my first two questions would be:

  1. Are the fields always in the same order? (in query strings, that's usually not guaranteed)
  2. Does each key appear in every line?

For a real-world task, I'd assume both answers are 'no,' so I'd have to be prepared to handle missing keys (including keys not appearing in the first line), and keys out of order. Given all that, I'd store the keys and values of each line in an array of hashes (an array to maintain the order of the lines). I'd also have a hash for keeping track of all the keys, and another hash for tracking which keys have the same value throughout so they can be dropped from the output. (There may be a clever way to make one hash do both those things, but it didn't occur to me.) This is what I ended up with. It drops the key/value pairs that are always the same (a & e), regardless of order or whether a key is sometimes missing.

One note: I tell the inner split to always produce 2 fields, because otherwise it'll produce undef where there's no value, which messes up the nifty map-to-hash.

#!/usr/bin/env perl use 5.010; use strict; use warnings; use Data::Printer; my @l; # array of hashes, to hold the keys and values for each line my %a; # hash to keep track of all keys my %c; # hash to track whether a key changes value # if a key is still in the hash when the loop finishes, # that means it had the same value throughout and should # be ignored while(<DATA>){ chomp; my %h = map { split /=/,$_,2 } split '&'; # split string into keys +/values push @l, \%h; if( $. == 1 ){ # on the first line, load into %c %c = %h; } else { # on other lines, check values for my $k (keys %h){ delete $c{$k} unless exists $c{$k} and $c{$k} eq $h{$k}; # + remove if different $a{$k}=1; # keep track of key existing } } } # remove consistent keys from output delete $a{$_} for keys %c; for my $l (@l){ for my $k (sort keys %a){ printf "%-8s", $l->{$k} ? "$k=$l->{$k}" : ' '; } say ''; } # I've mixed up the data a bit to reflect my open requirements. # a & e should still be dropped from the output. __DATA__ a=1&b=1&f=3&c=1&d=2&e= a=1&b=2&c=3&e=&f=4 b=2&a=1&c=5&d=1&e=&f=5

Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.