When introducing new concepts, it's always best to demonstrate them on something everyone is familiar with. In our case that is, of course, UPX, with which we are well acquainted by now. It almost feels like we write one UPX unpacker each week, doesn't it?
Today we are presenting an optimization concept that enables us to unpack everything in a single go. Now, when talking about file unpacking we always unpack everything in one go, but we never unpack both the main executable module and all of its packed dependencies in a single run. Normally, you would do this by batching through the individual files. From a speed perspective, however, the best optimization imaginable is unpacking the main module and all of its dependencies at once. Since TitanEngine wasn't really designed to do that out of the box, it needs just a little bit of help to pull it off.
The problem is the existence of multiple relocation tables and, more importantly, multiple import tables. Since TitanEngine was designed to unpack files one at a time, we must do some additional coding around these boundaries to achieve our goal. Compared to a traditional TitanEngine dynamic unpacker, the only difference is the need to collect import table data for all modules in one place, and to use that data for whichever module has reached its entry point jump. UPX is a special case because it always imports packed file dependencies through the import table. This is, of course, a static way of importing libraries, but our approach must be flexible enough to cover both dynamic and static importing.
To achieve our goal we have to scan the main module and all loaded libraries and try to find the appropriate patterns. Once the patterns are found, we set breakpoints and store information about them so we know which module triggered which callback event. Normally we have three callbacks for UPX unpackers (LoadLibrary, GetProcAddress and EP jump), but since we are doing transverse unpacking we need one more: a custom handler for the load library event, which determines whether the loaded dependencies are packed with UPX by trying to find the necessary breakpoint patterns. Even though only one module can be loading at any given time, we still need to store the import data per module, because the import tables for the main executable and its dependencies might overlap if the modules are loaded dynamically. Once stored, the import info for each module is retrieved when that module hits its entry point callback.

Relocations aren't really a problem, since there is just one module loading at a time, so we can use our "snapshot and compare" model, provided that the modules load at non-default image bases. This can be arranged in numerous ways. One of the easiest is to compile the sample files so that they rebase by default (which is considered cheating in the unpacking game); alternatively, we can pre-allocate the memory at the preferred base so that the modules have no choice but to pick another base address. For the purpose of this blog we cheated; in a real-world application of this approach you mustn't.
In the real world you will hardly ever see this kind of case, but if you do, you now know how to get everything in one go. Until next week...
(package contains the unpacker with source and the samples used)