Truman is an Open Source sandnet system for automated malware analysis in a "live fire" environment.
Written by Joe Stewart of LURHQ/SecureWorks, it implements an isolated network, complete with a small assortment of simulated or restricted services (DHCP, DNS, HTTP, IRC, MySQL, FTP & SMTP). You provide a malware client box to actually run the samples, connect it to the private network shared with the Truman server, and you're off and running.
Truman System Roles
In a standard Truman setup, there are two computers: a Truman server and a malware analysis client. The server typically runs Linux, while the malware analysis client is a standard Windows build, often Windows 2000 or XP.
The Truman server has several simultaneous roles:
- Provides DHCP and a PXE boot environment for the malware client
- Serves as a repository of hard drive images (both a baseline "clean" image and a number of "infected" images)
- Provides or simulates basic network services commonly used by malware
- Acts as a collection point for network-based data (in PCAP format)
- Provides basic analysis tools for examining host-based and network-based data, including an initial automated analysis pass
The Truman Process
Truman is conceptually a very simple system (once it's set up properly, that is). The server stores a complete image of the client's hard drive, which is used both to compare system state before and after infection, and to re-image the client with a clean baseline between each analysis.
Here's how it works. The Truman server has two physical network interfaces. One connects to a private LAN shared with the malware client, and the other attaches the server to the analysis network so the user can log on remotely. This second connection is optional, and Truman can easily be used while completely detached from the normal network. The malware client has only the private LAN, and no other network connection.
The malware client is configured to boot from the network via the PXE environment. It receives its IP address from the server's DHCP service, which also serves up a small Linux-based boot image. The client can be booted into several modes (e.g., store a baseline image, restore itself from the baseline, or create a new infected image and then automatically restore itself to the baseline), and the boot environment is responsible for carrying out these functions. In general, the first step in analyzing a binary is to skip these functions and boot the system from the local disk, which is the default mode.
Once Windows boots on the client, the analyst logs in with a special user account that is configured to automatically run the client portion of the analysis script. This script contacts the HTTP service on the Truman server to ask for a binary to analyze. While there are no binaries in the analysis queue, the script loops, waiting 60 seconds between requests. If there is a binary, however, the client downloads it to c:\WINDOWS\system32\sandnet.exe and runs it.
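The polling behaviour described above can be sketched in a few lines of Python. This is only an illustration, not Truman's actual client script: the function names are hypothetical, the URL passed to `http_fetch` would be whatever your server exposes, and treating an empty response body or an HTTP error as "queue is empty" is an assumption. Only the 60-second interval and the destination path come from the description above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

POLL_INTERVAL_SECONDS = 60  # wait between requests, per the text above
TARGET_PATH = r"c:\WINDOWS\system32\sandnet.exe"  # destination, per the text above


def http_fetch(url: str) -> Optional[bytes]:
    """Ask the server for a sample; return None when nothing is queued.

    Assumption: an empty body or any HTTP/network error means "no sample yet".
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read()
    except urllib.error.URLError:
        return None
    return body or None


def wait_for_sample(
    fetch: Callable[[], Optional[bytes]],
    sleep: Callable[[float], None] = time.sleep,
    interval: float = POLL_INTERVAL_SECONDS,
) -> bytes:
    """Call fetch() until it returns a sample, sleeping between empty replies."""
    while True:
        sample = fetch()
        if sample is not None:
            return sample
        sleep(interval)


def save_sample(sample: bytes, path: str = TARGET_PATH) -> str:
    """Write the downloaded sample to disk and return the path used."""
    with open(path, "wb") as handle:
        handle.write(sample)
    return path
```

Passing `fetch` and `sleep` in as parameters keeps the loop testable without a network or real delays. On the client, the saved file would then be launched, for example with `subprocess.Popen([path])`.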
Once the suspect binary is running, the system simply waits for 10 minutes, to allow it time to do its dirty work. When time is up, it saves a copy of the RAM image onto the root of the system drive and then reboots. In the meantime, the TFTP service on the Truman server has magically modified the client's boot state.
This time, when the client boots up, it saves a complete image of the local system partition to the Truman server, then restores itself back to the clean baseline image also stored on the server. Once more, it reboots. This time, the TFTP service has again magically changed the boot state to once more boot from the local disk, and it goes back into Windows and waits for the next analysis.
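One way to picture the boot-state changes is as a two-state toggle that flips each time the client fetches its boot configuration. The sketch below is a simplified model under that assumption; the names `BootMode`, `next_mode` and `boot_sequence` are hypothetical and do not come from Truman itself, and the real TFTP service may track state differently.

```python
from enum import Enum
from itertools import islice
from typing import Iterator


class BootMode(Enum):
    LOCAL_DISK = "boot Windows from the local disk"
    SAVE_AND_RESTORE = "save infected image, then restore baseline"


def next_mode(current: BootMode) -> BootMode:
    """Return the mode the server would hand out on the following boot."""
    if current is BootMode.LOCAL_DISK:
        return BootMode.SAVE_AND_RESTORE
    return BootMode.LOCAL_DISK


def boot_sequence(start: BootMode = BootMode.LOCAL_DISK) -> Iterator[BootMode]:
    """Yield the mode used on each successive boot, starting from `start`."""
    mode = start
    while True:
        yield mode
        mode = next_mode(mode)


# One full analysis cycle: run the sample, image and restore, wait for the next one.
for boot_number, mode in enumerate(islice(boot_sequence(), 3), start=1):
    print(boot_number, mode.value)
```

The three boots printed correspond to the cycle described above: Windows boots and runs the sample, the next boot saves the infected image and restores the baseline, and the third boot returns to Windows to wait for the next binary.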
At this point, it is time for the analyst to take a manual action. Truman comes with a customizable script to automate the process of collecting information from the new infected image. The analyst simply runs this script, which takes the following actions by default (replace $EXENAME with the name of the infected binary, e.g., malware.exe):
- Creates a new directory called /forensics/$EXENAME-files
- Copies the newly created infected image to /images/$EXENAME.img
- Creates lists of all files and registry entries in the new image, and diffs them against the entries in the baseline
- Retrieves and stores the malware client's pagefile and RAM dump file.
- Stores copies of the logs from the faux services
- Retrieves a PCAP file of all network traffic that occurred during this analysis run
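The naming scheme and the baseline comparison in the list above can be illustrated with a short Python sketch. It is not the script that ships with Truman: the function names are hypothetical, and comparing the listings as plain sets of strings is an assumed simplification of whatever diff the real script performs. The two path patterns are taken from the list above.

```python
from typing import Dict, Iterable, List


def forensics_dir(exename: str) -> str:
    """Directory that holds the collected artifacts for one sample."""
    return f"/forensics/{exename}-files"


def image_path(exename: str) -> str:
    """Location of the saved infected image for one sample."""
    return f"/images/{exename}.img"


def diff_listing(baseline: Iterable[str], infected: Iterable[str]) -> Dict[str, List[str]]:
    """Compare two listings (file paths or registry keys) entry by entry.

    Returns the entries that appear only in the infected image ("added")
    and the entries that appear only in the baseline ("removed").
    """
    before = set(baseline)
    after = set(infected)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }
```

A set comparison like this reports entries that appeared or disappeared, but not files whose contents changed in place; catching those would need sizes, timestamps or hashes in each listing entry.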
The script itself is customizable, so you can add additional analysis steps yourself. For example, you may wish to scan the image with an antivirus package like ClamAV to try to identify the malware, or you may want to use Wireshark to generate text reports about what protocols are present in the PCAP file. The possibilities are nearly endless.
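As a rough sketch of what such extra steps might look like, the Python helpers below build command lines for ClamAV's `clamscan` and for `tshark`, the command-line tool that ships with Wireshark. The helper names and any paths you pass in are hypothetical, and the sketch assumes the infected image has already been mounted somewhere and that both tools are installed on the Truman server.

```python
import subprocess
from typing import List


def clamscan_command(mount_point: str) -> List[str]:
    """Command line for scanning a mounted image with ClamAV."""
    # -r recurses into subdirectories; --infected prints only infected files.
    return ["clamscan", "-r", "--infected", mount_point]


def protocol_report_command(pcap_path: str) -> List[str]:
    """Command line for a text summary of the protocols in a capture file."""
    # -r reads a capture file, -q suppresses per-packet output,
    # -z io,phs prints protocol hierarchy statistics.
    return ["tshark", "-r", pcap_path, "-q", "-z", "io,phs"]


def run_report(command: List[str]) -> str:
    """Run a command and return its standard output as text.

    check=False is deliberate: clamscan exits with status 1 when it finds
    something, which is a result here rather than a failure.
    """
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout
```

Keeping command construction separate from execution makes it easy to log or review exactly what will be run before pointing it at a real image or capture.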
At this point, the analyst has a complete mountable image of the system partition, as well as copies of all network traffic and several useful reports to read through. The Truman server has finished its work, the malware client has been reset to its clean baseline, and Truman is ready to analyze the next piece of malware.