Last September, I presented Shellzer at the RAID 2011 conference. Shellzer is a tool I developed back in August 2010 to dynamically analyze malicious shellcode. The main goal was to analyze the shellcode samples that Wepawet has collected over the years. Given the size of our dataset (about 30,000 shellcode samples at the time), an automated approach was clearly needed.
After trying several approaches and tools, I came across PyDbg, a Python Win32 debugging abstraction class. Using it, I started writing my own tool to dynamically analyze a given shellcode. My very first attempt consisted of single-stepping through the whole shellcode binary. This gave me complete control over the sample’s execution, which is ideal when the code under analysis is malicious. Unfortunately, this approach is not feasible in practice. The number of assembly instructions executed at run time is on the order of millions, even though a shellcode is commonly only a few hundred bytes long. This is because the code contains many loops, some of which are executed thousands of times. Moreover, the shellcode invokes Windows API functions, whose instructions also have to be stepped through. These two factors cause a huge overhead for an approach based on single-stepping, and the analysis consequently took several minutes on average.
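To make the gap between static size and executed instructions concrete, here is a minimal sketch in plain Python. It is a toy model only: the four-instruction machine, the `run_single_step` function, and the `PROGRAM` listing are all made up for illustration and are neither real x86 nor PyDbg code. The `on_step` callback marks the place where a debugger's single-step handler would be invoked.

```python
# Toy model only: a made-up four-instruction machine, not real x86 and not PyDbg.

def run_single_step(program, on_step=None):
    """Execute `program` one instruction at a time and return the step count."""
    ecx = 0    # the only register of the toy machine
    pc = 0     # index of the next instruction to execute
    steps = 0
    while pc < len(program):
        op, arg = program[pc]
        if on_step is not None:
            on_step(pc, op, arg)  # where a single-step handler would run
        steps += 1
        if op == "mov_ecx":
            ecx, pc = arg, pc + 1
        elif op == "dec_ecx":
            ecx, pc = ecx - 1, pc + 1
        elif op == "jnz":
            pc = arg if ecx != 0 else pc + 1
        elif op == "nop":
            pc += 1
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return steps


# Five static instructions, with one loop that runs 1000 times.
PROGRAM = [
    ("mov_ecx", 1000),  # loop counter
    ("nop", None),      # loop body (stands in for one decoding step)
    ("dec_ecx", None),
    ("jnz", 1),         # jump back to index 1 while ecx != 0
    ("nop", None),      # code after the loop
]

static_size = len(PROGRAM)
dynamic_steps = run_single_step(PROGRAM)
print(static_size, dynamic_steps)  # 5 3002
```

Even in this tiny example, every pass through the loop costs three handler invocations, so five instructions turn into 3002 steps. Real samples nest such loops and call into library code, which is where the millions of steps come from.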
My research then focused on finding a way to avoid single-stepping the whole execution of the shellcode while still maintaining complete control over it. This proved challenging because of the many evasion techniques these pieces of code use. If you are interested in the details, please read the paper. The output of the analysis currently consists of a detailed trace of the Windows API functions called (with their parameters and return values), the Windows DLLs that were loaded, and the list of URLs contacted by the shellcode. Furthermore, Shellzer supports the analysis of shellcode samples extracted from malicious PDF documents, in addition to those detected in web-based drive-by-download attacks.
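As a rough picture of what such a report holds, the sketch below models the three parts of the output described above as a small Python data structure. The class names, field names, and the `record_call` helper are hypothetical, and deriving the DLL and URL lists from call arguments is an assumption made for this example; it is not how Shellzer is actually implemented. The sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApiCall:
    """One traced Windows API call (hypothetical record layout)."""
    function: str
    params: List[object]
    return_value: int


@dataclass
class ShellcodeReport:
    """The three parts of the analysis output (hypothetical container)."""
    api_calls: List[ApiCall] = field(default_factory=list)
    loaded_dlls: List[str] = field(default_factory=list)
    urls: List[str] = field(default_factory=list)

    def record_call(self, function, params, return_value):
        """Append a call to the trace and update the derived lists."""
        params = list(params)
        self.api_calls.append(ApiCall(function, params, return_value))
        # Assumption: DLL names are taken from LoadLibraryA's first argument.
        if function == "LoadLibraryA" and params:
            dll = str(params[0])
            if dll not in self.loaded_dlls:
                self.loaded_dlls.append(dll)
        # Assumption: any string argument that looks like a URL is reported.
        for p in params:
            if isinstance(p, str) and p.lower().startswith(("http://", "https://")):
                if p not in self.urls:
                    self.urls.append(p)


# Invented example trace of a download-style shellcode.
report = ShellcodeReport()
report.record_call("LoadLibraryA", ["urlmon"], 0x7E1E0000)
report.record_call(
    "URLDownloadToFileA",
    [0, "http://example.invalid/payload.exe", "C:\\payload.exe", 0, 0],
    0,
)
print(report.loaded_dlls, report.urls)
```

Keeping the full call trace alongside the derived lists means the summary can always be checked against the raw calls it came from.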
Since November 2011, Wepawet has been using this tool. When a shellcode is detected, it is automatically forwarded to the shellcode analyzer, and Shellzer’s report is included in the main Wepawet report. Read this post for more details. Naturally, the tool is not perfect, and some samples cannot be analyzed yet. If you submit a sample to Wepawet, a shellcode is detected, and you don’t see the additional shellcode information, it means that something went wrong. Please don’t hesitate to contact us in case of errors: we need your feedback!