An introduction to malware forensics

This is part 2 in a series investigating a particular piece of malware.

Part 1 looks at how the malware is delivered. It and part 2 were originally a single post, later separated since they look at distinct phases in the attack.
Part 2 analyzes the bot - the agent which turns your computer into a remotely-controlled robot doing the attacker's bidding.
Part 3 dives into the first payload: code to test 30,000 addresses at 5,000 domains, to see if they could be used to send additional spam.

In my last post, we looked at a fairly typical spam message used to deliver malware to unsuspecting users. This message played on psychology (aka social engineering) to trick the reader: it posed as a confirmation for an expensive purchase (in this case, about $1,600), with a link to retrieve the "invoice" (actually the malware). It used Google redirectors to avoid a suspicious-looking link to Dropbox or some random web site.

Once the reader clicks the link and allows the file to download and run, their computer becomes infected with a botnet agent. For this post, I downloaded the malware into a virtual environment to do some analysis.

Static Analysis

Without running the file, there are some things I can learn about it. Right off the bat, I can note the filename, size, and timestamp, and can generate a couple of hashes that uniquely identify it:

Filename: Package_FLLG.PDF_.scr
File Size: 129,765 bytes
MD5 hash: 7a0055866bcf5d317d692dcf5ba84cac
SHA256 hash: fe137bc4a6f3ae605bb8f3cec7db69543c749ecdce3dc29dabb1183fd3c61ede
Time Stamp: 2014-10-16 21:02:30

A hash is a short, fixed-length string that is, for practical purposes, unique to the source. As 8-year-old Reuben Paul said at a recent conference, it's a bit like a strawberry smoothie: you put in your ingredients (a file), run them through the blender (the hash algorithm), and out the other end comes the smoothie (the hash). If the ingredients differ, the end result will differ, and you can never go backwards. There's no way to turn a smoothie back into a carton of strawberries, and there's no way to turn a hash back into the original data. One flaw in this analogy, though, is that hash algorithms are designed so that the hash differs greatly even if the source file differs only by the tiniest amount, whereas a smoothie will be more or less the same regardless of where the berries came from, as long as they are relatively similar in variety and freshness.
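To make the smoothie analogy concrete, here is a minimal Python sketch of how hashes like the ones above can be generated with the standard hashlib module. The function names file_hashes and differing_hex_digits are my own, made up for illustration; the second one shows how a one-character change in the input scrambles the digest.

```python
import hashlib

def file_hashes(path, chunk_size=65536):
    """Return (md5_hex, sha256_hex) for the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large sample never has to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()

def differing_hex_digits(data_a, data_b):
    """Count positions at which the SHA-256 hex digests of two inputs differ."""
    digest_a = hashlib.sha256(data_a).hexdigest()
    digest_b = hashlib.sha256(data_b).hexdigest()
    return sum(1 for a, b in zip(digest_a, digest_b) if a != b)
```

Calling differing_hex_digits(b"strawberry", b"strawberrz") should report that most of the 64 hex digits changed, even though the inputs differ by a single letter.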

The first thing I do with a new suspicious file is upload it to VirusTotal to see what about 50 antivirus products think of it. Not surprisingly, on first analysis only 3 AV products identified it as potentially malicious. VT also does some static analysis, and among other things describes this as a Windows PE executable, i.e. a program designed to run within Windows (as opposed to an HTML web page, which is merely static content that is interpreted and displayed by a browser).

Next I use the utility strings to extract human-readable information from the binary package. Much of the output is gobbledygook, but strings will give a list of functions (named pieces of code that often do something related to their name) and occasionally clues as to the author or a password. In this case one function in particular jumped out at me: "IsDebuggerPresent." This is curious, because I know that some modern malware takes steps to avoid being analyzed by people like me. If a debugger (software used to inspect the inner workings of a program) is present, the malware will either not run at all, or will do something completely benign. In some cases a file will also contain a name or nickname for the author, though that does not appear to be the case with this sample.
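For readers without the strings utility handy, the core idea can be approximated in a few lines of Python. This is only a rough sketch, not a replacement for the real tool: it looks for runs of printable ASCII and ignores other encodings, and the function name extract_strings and the minimum length of 4 are my own choices.

```python
import re

def extract_strings(data, min_length=4):
    """Return runs of printable ASCII at least `min_length` bytes long."""
    # \x20-\x7e covers space through tilde, i.e. the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]

# A made-up byte sequence standing in for part of an executable.
sample = b"\x00\x01MZ\x90\x00IsDebuggerPresent\x00\x02ab\x00kernel32.dll\xff"
print(extract_strings(sample))
```

Short runs such as "MZ" and "ab" fall below the minimum length and are dropped, which is what keeps most of the gobbledygook out of the output.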

Malware in Action

Static analysis is not my strength though, so I turned to what I do know well: the operating system, and the network. I fired up a virtual machine (a virtual computer running within a window on my actual computer - a safe place to run the malware without infecting my real systems) and started to play. I have a variety of tools I like to use to investigate things on a PC. Some of the most useful come from Sysinternals, a company acquired by Microsoft. For this analysis, I used Process Explorer (Task Manager on steroids), Process Monitor (which tells me the file, registry, and network interaction of a process), and AutoRuns (which identifies anything set up to run automatically when the computer boots, useful since most malware wants to remain active if you reboot). I also used Wireshark, a program for capturing and analyzing network traffic. Then I launched the malicious executable.

And ... nothing.

The binary executed but did absolutely nothing. No file changes, no registry changes, no network traffic. ProcMon showed it iterate through the registry and several file folders, then just sit there. As I suspected, this piece of malware is alert not only to the presence of a debugger (which I did not use) but also to a VM and evidently Wireshark. However, its VM detection method appears to be very simplistic, at least for VMware specifically - it appears to merely detect the vmtoolsd.exe process. I reverted the VM to a clean state, stopped the VMware Tools service and killed the vmtoolsd process. I also did not use Wireshark within the VM, instead running it on my host PC with a filter to capture only traffic to and from my VM's address. This time the program executed with more satisfying results.

Fully functioning, the malware copies itself to a hidden file, sets a registry key to cause it to launch every time the system reboots, deactivates the Windows firewall and antimalware software, deactivates User Account Control (Microsoft's somewhat effective method of keeping malware from embedding itself in the operating system), and deletes the visible copy of the file.
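Since a single running process was enough to make the sample play dead, it can help to check the analysis VM before detonating anything. The sketch below is a hypothetical pre-flight check of my own, not something recovered from the malware: it takes a list of running process names (gathered however you like) and reports any that are suspected of tipping the sample off. Only vmtoolsd.exe comes from the observations above; wireshark.exe is a guess at how Wireshark was spotted.

```python
# Process names assumed to tip off the sample. Only vmtoolsd.exe is named in
# the analysis above; wireshark.exe is a guess at how Wireshark was spotted.
SUSPECTED_TRIGGERS = {"vmtoolsd.exe", "wireshark.exe"}

def triggers_present(process_names, triggers=SUSPECTED_TRIGGERS):
    """Return the sorted trigger names found in a list of running process names."""
    # Windows process names are case-insensitive, so compare in lowercase.
    running = {name.strip().lower() for name in process_names}
    return sorted(running & set(triggers))

# Example with a made-up process list.
print(triggers_present(["explorer.exe", "VMToolsd.exe", "procexp.exe"]))
```

An empty result does not prove the environment is invisible to the malware, only that the known giveaways are gone.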

AutoRuns shows a couple of possibly-related entries. HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run contains a random-looking value to launch msqnnqx.exe. That to me is a red flag - it's a very common way to force a malicious file to run on every boot. But ... the file does not exist. That's because the malware also copied itself to c:\programdata and added a value to HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run. This matters because the former (HKCU) applies only to the current user, whereas the latter (HKLM) applies to the entire machine - if any user logs on, the malware runs. The third highlighted line is gathernetworkinfo.vbs, an unsigned, undocumented data gatherer oddly included by Microsoft in Windows 7 systems. It is highlighted because the file does not show a verified publisher - AutoRuns handily highlights anything without a verified publisher, because most properly-signed files are not malicious.
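The two checks I did by eye here (which hive an entry lives in, and whether its target file actually exists) are easy to express in code. This is a small sketch under my own assumptions: the entries are supplied as plain dictionaries rather than read from a live registry, the function names are made up, and the file paths in the example are placeholders, not the real locations on my VM.

```python
import os

def autorun_scope(key_path):
    """Classify a registry Run key as per-user or machine-wide by its hive."""
    hive = key_path.split("\\", 1)[0].upper()
    if hive in ("HKCU", "HKEY_CURRENT_USER"):
        return "current user only"
    if hive in ("HKLM", "HKEY_LOCAL_MACHINE"):
        return "all users"
    return "unknown"

def missing_targets(entries, exists=os.path.exists):
    """Return the entries whose target executable is not on disk."""
    return [entry for entry in entries if not exists(entry["target"])]

# Placeholder entries: the key paths are from the post, the targets are invented.
entries = [
    {"key": r"HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run",
     "target": r"C:\placeholder\msqnnqx.exe"},
    {"key": r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer\Run",
     "target": r"C:\placeholder\copy-of-malware.exe"},
]
for entry in entries:
    print(autorun_scope(entry["key"]), "-", entry["target"])
```

The exists parameter is there so the check can be pointed at a mounted disk image or a test stub instead of the live file system.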

Behind the scenes, the program begins some rather suspicious network behavior. Every 540 seconds, the system begins by doing a DNS lookup for update.microsoft.com, at DNS server 8.8.4.4. This is interesting, since my DHCP server hands out an OpenDNS resolver, 208.67.220.123. Nothing in my VM has any reason to query 8.8.4.4 (one of Google's public DNS servers). Perhaps this is a basic "am I online?" test to make sure of connectivity before doing anything more obviously suspicious. Upon a successful DNS lookup, the system establishes and then terminates a TCP connection with the IP address given. I'm not really sure what to make of this, other than that perhaps it's a further connectivity test, to be sure it's not in an isolated environment with a fake DNS provider.
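A fixed interval like this is a classic sign of automated beaconing, and it is easy to test for once you have the packet timestamps out of a capture. The following is a rough sketch, assuming the timestamps of the DNS queries have already been exported as a list of seconds; the function name beacon_interval and the 5-second tolerance are my own choices, not anything standard.

```python
from statistics import median

def beacon_interval(timestamps, tolerance=5.0):
    """Return the median gap in seconds if every gap is within `tolerance` of it, else None."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        # Two or fewer events cannot establish a repeating pattern.
        return None
    typical = median(gaps)
    if all(abs(gap - typical) <= tolerance for gap in gaps):
        return typical
    return None

# Made-up timestamps (seconds since the start of a capture).
print(beacon_interval([0, 540, 1081, 1620]))
```

Human-driven traffic tends to produce wildly uneven gaps, so a steady result here points toward a timer in the malware rather than anything the user is doing.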

Here is where it really gets interesting though. With this out of the way, the system now connects to port 8080 on 5.63.155.195, then generates an HTTP POST request to /home.php on that address, with one parameter:

Key=gcM/h1edMRTQwU5M5qyEzcer UNZmSHwbXpIf4RUZxqhXw8gjnmoZ3XFOqae

The response is an HTTP 500 server error, which reveals that the attacker is running PHP 5.5.16-1~dotdeb.1, or PHP version 5.5.16-1 on Debian Linux. On the surface I'm not sure what this means. It could be a mistake, it could be that the attacker is rotating among several servers, or it could be yet another connectivity test. Regardless, I'll file this knowledge away for later - perhaps there are known vulnerabilities in that version of PHP that I could later exploit?
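To file that version away in a form that is easy to compare against vulnerability lists later, the banner can be split into the upstream PHP version and the packaging suffix. This is a small sketch with a made-up function name; it assumes the banner looks like the string above, optionally in the "PHP/..." form that an X-Powered-By header typically uses (I have not said which header it came from, so treat that as an assumption).

```python
import re

def parse_php_banner(banner):
    """Split a PHP version banner into (upstream_version, package_suffix)."""
    # Accept both "PHP 5.5.16..." and "PHP/5.5.16..." styles.
    match = re.search(r"PHP[/ ](\d+\.\d+\.\d+)(\S*)", banner)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(parse_php_banner("PHP 5.5.16-1~dotdeb.1"))
```

The first element is the part to look up in PHP's own changelogs; the suffix only identifies how the package was built and distributed.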

Immediately after this error response, the system does the same with 185.20.226.41. This time the response is HTTP 200 OK, but with no content other than a blank line.

Of note: both IP addresses are assigned to the same ISP, in Russia. Hmm...

At this point, I am in a holding pattern. This malware is reaching out to two servers in Russia, probably looking for instructions. Various AV vendors identify this as either a remote access bot or trojan (likely) or a ransomware variant - malware that encrypts all data on a system and then holds it ransom until and unless you pay for the unlock key. More than likely the ransomware, if any, is to be delivered later to the bot client. The key value that my VM is sending is likely some unique identifier, possibly identifying this system for use in a botnet, possibly identifying it so an attacker can provide the correct decrypt key upon payment of a ransom. At this point I don't know, because it has not actually done anything. That said, several variants of ransomware have been known to wait up to 24 hours before encrypting, probably to separate in the end user's mind the visible infection from the act of clicking on the link. We shall see...

The story continues in part 3, where I look into instructions received from the bot master.

Wednesday, October 22, 2014
