Here's a vivisection of this japh, for those who have asked. It works by decoding Perl (or any ASCII text) from one of the two interwoven DNA strands, with every four nucleotides representing one ASCII character (unlike real DNA, which uses three nucleotides per codon, an 8-bit character needs four, since 4^4 == 256). Because DNA is "mirrored" on each strand (T's to A's and G's to C's), we only need to read one of the two strands. First, we'll rewrite the code in a more readable format:
# "$ _" is really "$_", and change the qq to a double-quote
$_ =
"CG
T--A
A---T
A----T
C----G
T----A
A---T
and
so
on
";
@_{A => C => G => T => } = 0..3;
s|.*(\w).*(\w).*\n|$_{$-++ / 9 % 2 ? $2:$ 1}|gex;
s|(.)(.)(.)(.)|chr (64*$1 + 16*$2 + 4*$3 + $4)|gex;
eval
Next, we'll make sense of the following line of code:
@_{A => C => G => T => } = 0..3;
# is really...
@_{'A', 'C', 'G', 'T'} = 0..3;
This is hash slice notation: it sets the A, C, G, and T keys of the %_ hash to 0, 1, 2, and 3 respectively. The => operator quotes the bareword on its left, so chaining it builds a list of strings without any quote marks.
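To see what that slice produces, here is a small sketch that can be run on its own. The name %digit_of is a hypothetical stand-in for %_, chosen only so the example works under strict.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %_ (the japh uses the global %_ instead).
my %digit_of;

# Hash slice: assign four values to four keys in one statement.
@digit_of{qw(A C G T)} = 0 .. 3;

# The => operator quotes the bareword on its left, so a chain of them
# (including the trailing one) builds the same list of strings.
my @keys = (A => C => G => T =>);

print join(' ', map { "$_=$digit_of{$_}" } @keys), "\n";   # A=0 C=1 G=2 T=3
```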
The next line of code transforms the chromosome into a series of Base4 digits, by substituting the appropriate digit for each line:
s|
.* # greedily match
(\w) # match first letter, and store into $1
.* # greedily match
(\w) # match last letter, and store into $2
.*\n # eat up remainder of line
|
# this expression maps the relevant character to its Base4 digit from
# the %_ hash. The $- is used as a line counter (it defaults to 0). When
# the DNA strands flip positions, this continues decoding on the correct
# strand (see physi's comment for a visual representation of this)
$_{$-++ / 9 % 2 ? $2:$ 1}
|gex;
After this substitution, $_ looks something like 010210320210103203.... All that's left to do is to transform each group of four Base4 digits into the ASCII character it encodes:
s|
# store next four characters into $1,$2,$3, and $4
(.)(.)(.)(.)
|
# replace with a Base4-to-ASCII conversion of those characters
chr (64*$1 + 16*$2 + 4*$3 + $4)
|gex;
After this, $_ contains our decoded code: print"Just Another Perl Hacker\n". The string is at last eval'd, and japhage occurs.
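As a worked example of that last conversion, the sketch below applies the same arithmetic to two characters by hand. The helper name base4_to_text is made up for illustration and is not part of the japh.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a string of base-4 digits into text,
# four digits per character, using the same formula as the japh.
sub base4_to_text {
    my ($digits) = @_;
    (my $text = $digits) =~ s/(.)(.)(.)(.)/chr(64*$1 + 16*$2 + 4*$3 + $4)/ge;
    return $text;
}

# 'p' is ASCII 112 = 1*64 + 3*16 + 0*4 + 0, so its digits are 1300;
# 'r' is ASCII 114 = 1*64 + 3*16 + 0*4 + 2, so its digits are 1302.
print base4_to_text('13001302'), "\n";    # prints "pr"
```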
And, in case anyone's interested, here's the corresponding Perl-to-DNA encoder:
use strict;
my $BASE = 4;
my %NUC_PAIRS = (
A => T =>
C => G =>
G => C =>
T => A =>
);
my @DIGIT_TO_NUC = qw( A C G T );
my $FMT_DNA = <<END;
01
0--1
0---1
0----1
0----1
0---1
0--1
01
10
1--0
1---0
1----0
1----0
1---0
1--0
10
END
my @FMT_DNA = split "\n",$FMT_DNA;
my $str = 'print"Just Another Perl Hacker\n"';
my @str_digits;
# convert each character into four Base4 digits, most significant first
for (split //, $str) {
    my $ord = ord($_);
    my @digits = (0) x 4;
    print "$ord:\t";
    my $i = 0;
    while ($ord) {
        $digits[4 - ++$i] = $ord % $BASE;
        $ord = int($ord / $BASE);
    }
    print "@digits\n";
    push @str_digits, [@digits];
}
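# print one helix line per digit, cycling through the template lines
my $i = 0;
for (@str_digits) {
    for (@$_) {
        my $fmt = $FMT_DNA[$i++ % @FMT_DNA];
        my $nuc0 = $DIGIT_TO_NUC[$_];      # nucleotide carrying the digit
        my $nuc1 = $NUC_PAIRS{$nuc0};      # its complement on the other strand
        $fmt =~ s/0/$nuc0/;
        $fmt =~ s/1/$nuc1/;
        print "$fmt\n";
    }
}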
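For anyone who wants to check the encoding without the helix layout, here is a rough round-trip sketch. It is a simplification, not the encoder above: it writes a single strand as one flat string, assumes ASCII input, and uses bit shifts in place of the repeated division loop. The helper names text_to_strand and strand_to_text are made up for this example.

```perl
use strict;
use warnings;

my @DIGIT_TO_NUC = qw(A C G T);
my %NUC_TO_DIGIT;
@NUC_TO_DIGIT{@DIGIT_TO_NUC} = 0 .. 3;

# Hypothetical helper: text -> one nucleotide letter per base-4 digit,
# most significant digit first (assumes every character is below 256).
sub text_to_strand {
    my ($text) = @_;
    return join '', map {
        my $ord = ord;
        map { $DIGIT_TO_NUC[ ($ord >> $_) & 3 ] } 6, 4, 2, 0;
    } split //, $text;
}

# Hypothetical helper: nucleotide letters -> digits -> characters.
sub strand_to_text {
    my ($strand) = @_;
    (my $digits = $strand) =~ s/([ACGT])/$NUC_TO_DIGIT{$1}/g;
    $digits =~ s/(.)(.)(.)(.)/chr(64*$1 + 16*$2 + 4*$3 + $4)/ge;
    return $digits;
}

my $strand = text_to_strand('print');
print "$strand\n";                      # CTAACTAGCGGCCGTGCTCA
print strand_to_text($strand), "\n";    # print
```

The first five groups of four (CTAA CTAG CGGC CGTG CTCA) are the digits 1300 1302 1221 1232 1310, which are the ASCII codes 112, 114, 105, 110, and 116.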
MeowChow
s aamecha.s a..a\u$&owag.print