http://qs321.pair.com?node_id=1162844


in reply to Tallying overall frequency of characters in a set of strings by position

Just for interest, here's a Perl 6 solution that works for arbitrary sets of input characters and arbitrary input lengths:
#! /usr/bin/env perl6
use v6;

my @data = < AABBC BAABC AABBD AACBB >;

# Use mixhashes (self-totalling, and they default to zero)
my @freq = MixHash.new xx max @data».chars;

# Count everything
for @data».comb -> @chars {
    for @chars.kv -> $pos, $char {
        @freq[$pos]{$char}++;
    }
}

# Column labels
my @labels = @data.join.comb.unique.sort;
say join "\t", '', @labels;

# Table rows
for @freq.kv -> $pos, %score {
    say join "\t", ($pos+1).fmt("%2d"), %score{@labels}.map( * / %score.total )».fmt("%.2f")
}
...and the output:
        A       B       C       D
 1      0.75    0.25    0.00    0.00
 2      1.00    0.00    0.00    0.00
 3      0.25    0.50    0.25    0.00
 4      0.00    1.00    0.00    0.00
 5      0.00    0.25    0.50    0.25
Or with:
my @data = < AABBC BAABC AABBD AECBBF >;
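...you get the following (each row is divided by its own total, so row 6 shows 1.00 for F because only one string is long enough to reach position 6):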
        A       B       C       D       E       F
 1      0.75    0.25    0.00    0.00    0.00    0.00
 2      0.75    0.00    0.00    0.00    0.25    0.00
 3      0.25    0.50    0.25    0.00    0.00    0.00
 4      0.00    1.00    0.00    0.00    0.00    0.00
 5      0.00    0.25    0.50    0.25    0.00    0.00
 6      0.00    0.00    0.00    0.00    0.00    1.00
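
In case the "self-totalling, and they default to zero" comment in the script is unclear, here is a minimal sketch of those two MixHash properties on a single column. The variable name $col and the sample characters are made up for illustration and are not part of the script above:

```perl6
use v6;

# Hypothetical single column: the characters seen at one position
my $col = MixHash.new;
$col{$_}++ for <A B A A>;

say $col<A>;                 # 3
say $col<Z>;                 # 0  (a key that was never seen reads as zero)
say $col.total;              # 4  (sum of all the counts)
say $col<A> / $col.total;    # 0.75
```

That is why the table rows can look up every label in every row without checking whether the character occurred there, and why dividing by .total gives the per-position frequency directly.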