0. Motivation
Three reasons to deobfuscate vladb's signature. First, fun and profit, as
pointed out by jmcnamara. Second, vladb monkself said "What I think
would be useful, in addition to your idea, is to have more authors
of original obfuscations to submit a link somewhere in their post
to a spoiler page", but the only spoiler I could find relating to
the sig was in the original post. Finally, vladb said
that he is still "...not able to de-obfuscate much of the code...";
me neither, and I need to start somewhere! :)
====
1. The original
$"=q;grep;;$,=q"grep";for(`find . -name ".saves*~"`){s;$/;;;/(.*-(\d+)-.*)$/;
$_=["ps -e -o pid | "," $2 | "," -v "," "];`@$_`?{print"+ $1"}:{print"- $1"}&&`rm $1`;
print$\;}
====
2. Apply a bit of formatting. (Or, in retrospect, why I'm a lamo.)
I'm unsure whether that grep;;$,=q"grep"; should really be one line
or two... I'll assume one for now.
$"=q;
grep;;$,=q"grep";
for(`find . -name ".saves*~"`){
s;$/;;;
/(.*-(\d+)-.*)$/;
$_=["ps -e -o pid | "," $2 | "," -v "," "];
`@$_` ? {print"+ $1"} : {print"- $1"} && `rm $1`;
print $\;
}
====
3. Add in some line numbering.
1: $"=q;
2:
3: grep;;$,=q"grep";
4:
5: for(`find . -name ".saves*~"`){
6: s;$/;;;
7: /(.*-(\d+)-.*)$/;
8: $_=["ps -e -o pid | "," $2 | "," -v "," "];
9: `@$_` ? {print"+ $1"} : {print"- $1"} && `rm $1`;
10: print $\;
11: }
====
At first glance, line 1 seems to set $", the list separator used when an
array is interpolated into a quoted string, to the letter 'q'. The
tinkering with $" makes me think that vladb is going to use an array
somewhere later on; and thinking that $"=q; meant $"='q';, one might
start thinking about the upcoming array...
I'm kind of slow with coding, so just ran a quick oneliner to see whether
the first line really does what I thought:
perl -e '@a = qw (a bc def); print "before: @a\n"; $"=q; print "after: @a\n";'
Output:
before: a bc def
This is surprising; obviously two things are happening here:
1. $"=q; is actually using the q operator to quote, umm,
something...
2. For some reason, when $" is set to that, err, something, the second
print statement fails.
Line 1 confused me! B::Deparse gets a lot of mention in the
monastery; maybe it can help me here.
perl -MO=Deparse -e '@a = qw (a bc def); print "before: @a\n"; $"=q; print "after: @a\n";'
Output:
@a = ('a', 'bc', 'def'); # ok, I agree
print "before: @a\n"; # yep, still with ya
$" = ' print "after: @a\\n"'; # whoa, this is unexpected
-e syntax OK # good news, I guess
====
Aha, after playing around above, reading perlop, and looking back at
my first step, I see where vladb led me astray: I split up his code
wrong!
$"=q;grep;;$,=q"grep";
should actually be broken up like this:
$"=q;grep;;
$,=q"grep";
which is the equivalent of this:
$" = 'grep';
$, = 'grep';
Cut down a tree with a herring? Sure, I'll try, but only if it's red...
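To double-check the corrected reading, here is a minimal sketch (the variable names are mine, not vladb's). It assumes nothing beyond the q operator accepting almost any punctuation character as its delimiter, which is what perlop describes:

```perl
use strict;
use warnings;

# q with ';' as the delimiter: the string runs from the first ';' to the
# second one, and the third ';' is the ordinary statement terminator.
my $list_separator = q;grep;;

# q with '"' as the delimiter.
my $field_separator = q"grep";

print "both are 'grep'\n"
    if $list_separator eq 'grep' and $field_separator eq 'grep';
```

Seen this way, the ;; in the signature is not an empty statement at all: the first semicolon closes the quote and the second ends the statement.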
====
If I had been running the modified signature as I, um, modified it,
I would've caught my mistake sooner. As it is, vladb's misdirection waylaid me for an hour (actually, I gave up, but while eating lunch figured out my mistake). But the lost time gave me a chance to read up on $" and $, in perlvar.
I'm a big fan of $", actually; I do this in oneliners a lot:
$"=$/; print "@a\n"; which prints the elements of
@a on their own line. ($/, incidentally, is the "input record
separator"; the default character is \n.)
But I don't use $, very much at all. This turns
out to be useful as well: if one has code such as
$, = '|'; print $foo, $bar, $baz, "\n" then one can
generate nicely formatted (in this case, pipe-delimited) lines
without having to muck around with the equivalent printf statement.
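A small sketch of the two variables side by side may help. The array contents and the in-memory filehandle are just my own illustration; the only things taken from perlvar are that $" is used when an array is interpolated into a string and that $, is used between the arguments of print:

```perl
use strict;
use warnings;

my @words = qw(a bc def);

# $" joins array elements when the array is interpolated in a string.
my $interpolated = do { local $" = '-'; "@words" };

# $, is placed between the arguments of print. Printing to an in-memory
# filehandle lets us capture exactly what print produced.
my $printed = '';
{
    open my $fh, '>', \$printed or die "cannot open in-memory handle: $!";
    local $, = '|';
    print {$fh} 'foo', 'bar', 'baz';
    close $fh;
}

print "$interpolated\n";   # a-bc-def
print "$printed\n";        # foo|bar|baz
```

Using local keeps the change confined to the enclosing block, which is a kinder habit than setting either variable globally the way the signature does.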
====
4. Code re-write, using discovery above
1: $"= 'grep' ;
2: $,= 'grep' ;
3:
4: for(`find . -name ".saves*~"`){
5: s;$/;;;
6: /(.*-(\d+)-.*)$/;
7: $_=["ps -e -o pid | "," $2 | "," -v "," "];
8: `@$_` ? {print"+ $1"} : {print"- $1"} && `rm $1`;
9: print $\;
10: }
====
Whew! At this point, we've only looked at the first two lines of
code! Fortunately, lines 4-7 are fairly straightforward.
Line 4: Setting $" and $, to 'grep' is a clue that vladb's signature
is a Unix utility of some sort; the for ( `find ...`) clinches
it. (Uhh, not to mention the original description!)
Line 4 runs a shell command (the Unix command "find") and, for each line
that is returned, processes it according to lines 5-9.
This particular find command is going to search the current directory and
all of its subdirectories for files that match a
particular naming convention. The regex for these filenames would be
something like /^\.saves.*?~$/, if that helps you. Otherwise, here are a few
examples:
foo.saves_blah~ # no match
.saves_foo # no match
.saves_foo~ # match!
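The same three examples can be checked mechanically. This is only a sketch: find's -name test uses a shell-style glob rather than a regex, so the pattern below is my regex approximation of that glob, not something find itself runs:

```perl
use strict;
use warnings;

# The three example names from above.
my @names = ( 'foo.saves_blah~', '.saves_foo', '.saves_foo~' );

# Regex approximation of the glob .saves*~
my @matches = grep { /^\.saves.*?~$/ } @names;

print "match: $_\n" for @matches;   # match: .saves_foo~
```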
On Unix and Linux, a filename that starts with a dot (.) is a "hidden"
file, which can only be seen if you use an extra flag on 'ls' (same
function as DOS 'dir' command). So the find command is going to find a
bunch of "hidden" files that start '.saves', continue with whatever
text describes what file is saved, and end with a tilde (~). An example
might be .saves_Big_Project_backup_27~
Chances are you don't have any of these in your directory on your
machine, so the find command would return nothing. And with no data
to apply the for block to, perl just skips the block in toto.
====
Well, that's pretty boring stuff. I wonder what happens when vladb
uses this tool on his machine? Presumably the find command returns some
data, so lines 5-9 get to kick in.
Line 5: I didn't bother reformatting this; we can do so now.
s;$/;;;
In a substitution (s///), one can choose an alternate delimiting character.
This is useful if you have a lot of '/' that you are processing, and find
yourself escaping them all the time: '\/'. Consider if you wanted to
remove all '//' from a line:
s/\/\///g;
versus
s,//,,g;
Notice how much cleaner the second form is.
vladb is doing the same thing: using an alternate delimiter on his
s///. He's using ';', though, because he figures that he might be able
to catch overzealous deobfuscators out a second time (remember the "a lamo"!)
But we're on to his semicolon madness, and know immediately that line 5
is removing the first $/ it finds in $_ (there is no /g flag, so only the
first match goes). Since $/ defaults to \n, and since vladb hasn't changed
it, we know we're really removing the first newline from $_. find is only
going to return one newline per line of output - this makes sense - so
really line 5 is the same as
chomp;
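Here is a quick sketch comparing the two. The sample line is made up to look like find output, and the variable names are mine:

```perl
use strict;
use warnings;

# A made-up line of find output, newline included.
my $line = "./.saves-1234-myhost~\n";

# vladb's form: s/// with ';' as the delimiter and $/ as the pattern.
( my $via_subst = $line ) =~ s;$/;;;

# The conventional form.
chomp( my $via_chomp = $line );

print "same result\n" if $via_subst eq $via_chomp;

# Without /g only the first newline is removed, so the two forms differ
# on a string that has a newline in the middle.
( my $two_lines = "a\nb\n" ) =~ s;$/;;;
print "second newline survives\n" if $two_lines eq "ab\n";
```

For single lines coming back from find the difference never shows up, which is why chomp is a fair translation here.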
====
Line 6 is a simple pattern match: perl actually lets you comment your
regexes if you want, so let's try that out.
/(.*-(\d+)-.*)$/;
becomes
/ # start of pattern match
( # begin storing into $1
.* # store any number of any character...
- # ...followed by a hyphen...
( # begin storing into $2
\d # ...any digit...
+ # ...as many as we can grab...
) # stop storing into $2
- # ...followed by another hyphen
.* # ...followed by any number of any character...
) # stop storing into $1
$ # end of the line, bub
/x; # / to terminate regex, x to allow comments
Right away this tells me that I'd misguessed the naming convention that
vladb is using: my previous example, .saves_Big_Project_backup_27~,
wouldn't have succeeded at all: the regex says there must be a hyphen,
some digits, and a hyphen; the example actually doesn't have any hyphens surrounding the digits. (Oh well, the example served its purpose: to get me thinking about the data.)
The naming convention is probably .saves-$$-~ where "$$" is the process
id number of the program that created the save file. Putting the
process id, or pid, into a temporary file's name is useful for two
reasons: first, generally your OS doesn't cycle pids very quickly, so it's a lazy
way of making sure your temp file names are unique; second, you can
identify the owner of the temp file, and if the owner isn't running
anymore, you can remove the old file.
(Which, if you read vladb's description, is exactly what
this utility does!)
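To see the captures in action, here is a sketch run against a made-up filename that fits the guessed convention (the name and the pid 1234 are my invention), plus my earlier wrong guess for contrast:

```perl
use strict;
use warnings;

# Hypothetical filename, shaped like a line of find output after chomp.
my $found = './.saves-1234-myhost~';

my ( $filename, $pid );
if ( $found =~ /(.*-(\d+)-.*)$/ ) {
    ( $filename, $pid ) = ( $1, $2 );
}
print "file: $filename, pid: $pid\n";   # file: ./.saves-1234-myhost~, pid: 1234

# The earlier guess has no hyphens around its digits, so it fails.
my $earlier_guess_matches =
    '.saves_Big_Project_backup_27~' =~ /(.*-(\d+)-.*)$/ ? 1 : 0;
print "earlier guess matches: $earlier_guess_matches\n";   # 0
```

Note that $1 ends up holding the whole filename because its parentheses wrap the entire pattern; only $2 is a proper extraction.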
====
Line 7 made my eyes water. It looks like a shell command is being built,
but to do what? Remember that line 6 stuffed a pid into $2. Line 7 is
going to use that stored data and build a ps command that checks whether
that pid is still around.
$_=["ps -e -o pid | "," $2 | "," -v "," "];
First off, we've got what I call "the anonymous array square brackets".
(It ain't catchy but it sure helps me remember what they do.)
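If we de-obfuscate this line a bit, it becomes:
@command = ("ps -e -o pid | ", " $2 | ", " -v ", " ");
$ar_command = \@command;
Nothing has been joined together yet; that happens in line 8, where the
array is interpolated inside backticks. Backticks interpolate just like
double quotes, so the elements get glued together with $", which gives:
ps -e -o pid | grep $2 | grep -v grep
Where did those 'grep's come from? Remember back to line 1:
$" = 'grep';
So where you see a comma in line 7, you can mentally think "grep"
instead. (Note that it's $" doing the work here, not $,: $, only affects
the arguments of print.)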
But what does the @command do? Let's look.
ps -e -o pid # use the 'ps' command to look at the process stack;
# the -e flag says to look at all running processes;
# the '-o pid' flag specifies to return their process ids.
| # take the output from the previous command and use it
# as input for this next command
grep $2 # look for the pid that we found in line 6; this pid,
# remember, comes from the tempfile name, and tells us
# who the owner of $_ is.
| # take the output from the previous command and use it
# as input for this next command
grep -v grep # Drop any line that contains 'grep'. This is the usual guard
# for 'ps | grep' pipelines, where the grep process itself can
# show up in the listing and be mistaken for the process we're
# looking for. (With '-o pid' the listing holds only pid numbers,
# so here there is nothing for it to filter out.)
In line 8 vladb will actually run this command; for now if you only take
one thing away from this, it should be this: the output of the command
will be either 0 lines of data, in which case the process isn't running,
or it will be 1 line of data, in which case the process still is running.
Remember, though, that vladb didn't want to give away the whole bag at
once, so instead of writing:
$_ = "ps -e -o pid | grep $2 | grep -v grep";
he instead wrote
$_=["ps -e -o pid | "," $2 | "," -v "," "];
And one of the consequences of this is that $_ isn't actually the full
command that we want; it's a reference to an anonymous array, and the anonymous array is what contains the real command!
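The assembly of the command can be sketched without running anything. This assumes only that a dereferenced array interpolated into a double-quoted string is joined with $" (backticks interpolate the same way); the pid 1234 and the variable names are mine:

```perl
use strict;
use warnings;

my $pid = 1234;    # hypothetical pid, standing in for $2

# The same four pieces vladb stores in his anonymous array.
my $parts = [ "ps -e -o pid | ", " $pid | ", " -v ", " " ];

# Interpolating the dereferenced array joins the pieces with $".
my $command = do { local $" = 'grep'; "@$parts" };

print "$command\n";   # ps -e -o pid | grep 1234 | grep -v grep
```

The trailing space left by the last array element is harmless to the shell.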
====
So in Line 8, when vladb actually wants to check the process stack for
those running processes, he must first dereference the array.
As TheDamian wrote in an oldish article archived at
perl.com, "...A reference is like the traditional Zen idea of the "finger pointing at the moon". It's something that identifies a variable, and allows us to locate it. And that's the stumbling block most people need to get over: the finger (reference) isn't the moon (variable); it's merely a means of working out where the moon is."
(N.B. if you haven't searched perl.com for your favorite authors and
personalities that hang out on perlmonks: why haven't you? Many have written
articles that will improve your understanding and use of perl almost within
seconds of reading!)
The dereferencing is done in line 8 by simply tossing an at-sign, @, in
front of $_.
Like line 4, line 8 uses backticks `` to run an external command and
feed its output back into the program. We know from the discussion of
line 7 what the command is - a search of running process IDs - and what
the expected output is (either nothing or a process ID).
Line 8 also uses a ternary conditional: this is a fancy way of writing
an if-else statement in just one line.
Consider:
perl -e '$foo = 0; $foo==0 ? print "foo is zero" : print "foo is non-zero";'
Output:
foo is zero
This is the same as:
$foo = 0;
if ( $foo == 0 ) {
print "foo is zero" ;
} else {
print "foo is non-zero" ;
}
We can re-write line 8 a little:
if ( `@$_` ) {
print "+ $1";
} else {
print("- $1") && `rm $1`;
}
And we can re-write it a little more:
my $owner_is_still_running = `@$_`; # search for a specific $pid
if ( $owner_is_still_running ) {
    print "$owner_is_still_running, keeping $1"; # found $pid, keep tempfile
} else {
    print "removing $1"; # didn't find $pid
    `rm $1`; # remove $pid's tempfile
}
====
Line 9 prints $\, which, umm, defaults to nothing; here it looks
like it's being treated as a newline, though, doesn't it? I've
gotta admit: I'm not sure where $\ gets set to \n...
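For what it's worth, here is a sketch of how $\ behaves once something does set it. How vladb runs his tool is a guess on my part, but perlrun documents that the -l switch assigns the current value of $/ to $\, which would be one way for a newline to get in there. The in-memory filehandle is just my way of capturing the output:

```perl
use strict;
use warnings;

my $captured = '';
open my $fh, '>', \$captured or die "cannot open in-memory handle: $!";

{
    # With $\ set, every print gets the value appended.
    local $\ = "\n";
    print {$fh} 'with separator';
}

# Back to the default (undefined): nothing is appended.
print {$fh} 'without';
close $fh;

print $captured eq "with separator\nwithout" ? "as expected\n" : "surprise\n";
```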
====
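Another thing I'm not sure of is why $, was set to 'grep'; this
seems like a bit of misdirection on vladb's behalf. $" earns its keep:
backticks interpolate like double quotes, so the array built in line 7
is joined with $" when line 8 runs `@$_`. But $, only matters when print
is handed a list, and every print here gets a single string. So as far
as I can tell, $, never gets used.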
====
And for my own fun, here's the de-obfuscated tool.
#!/usr/bin/perl
use strict;
use warnings;
# find files that match the naming convention
my @files = `find . -name ".saves*~"`;
foreach ( @files ) {
chomp;
# hold onto filename, and extract creator's pid
/ # start of pattern match
( # begin storing into $1
.* # store any number of any character...
- # ...followed by a hyphen...
( # begin storing into $2
\d # ...any digit...
+ # ...as many as we can grab...
) # stop storing into $2
- # ...followed by another hyphen
.* # ...followed by any number of any character...
) # stop storing into $1
$ # end of the line, bub
/x; # / to terminate regex, x to allow comments
my ( $filename, $creator_pid ) = ( $1, $2 );
# Check process stack for the creator's pid, storing command result
my $command = "ps -e -o pid | grep $creator_pid | grep -v grep";
my $command_result = `$command`;
# if the command result is positive, leave $filename alone...
# ... otherwise, remove $filename
if ( $command_result ) {
print "+ $filename\n";
} else {
print "- $filename\n";
`rm $filename`;
}
}
====
Summary
Hopefully this will be useful to some other monks as an example of how to start de-obfuscating. This is my first turn at writing a spoiler, and I gotta admit: it was pretty fun to figure this stuff out. Although (because?) I made a few wrong turns in my assumptions about the code, this exercise also helped me learn a little bit more about Perl. Thanks jmcnamara for the thread and vladb for the spoiler opportunity.
blyman
setenv EXINIT 'set noai ts=2'