Archive

Archive for December, 2015

Heap Tracking

December 23rd, 2015 No comments

This post will cover the topic of finding and inspecting differences in a process heap over time. It will cover two techniques: a non-invasive one that iterates and copies heap entries from a separate process, and an invasive one that uses dynamic binary instrumentation to track all heap writes. Heap tracking is useful if you want to monitor large scale changes in an application over time. For example, looking at the state of the heap and potentially what data structures were modified after pressing a button or performing some complex action.

Non-invasive Heap Diffing

The non-invasive technique relies on remotely reading every allocated heap block in a target process and copying the bytes to the inspecting process. Once this iteration is done, a snapshot of the heap will be created and can then be accurately diffed against another snapshot at a later point in time to see how the heap state changed. This traversal is accomplished with the HeapList32First/HeapList32Next and Heap32First/Heap32Next functions from the Toolhelp API. The traversal code is shown below:

const Heap EnumerateProcessHeap(const DWORD processId, const HANDLE processHandle)
{
    HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPHEAPLIST, processId);
    if (snapshot == INVALID_HANDLE_VALUE)
    {
        fprintf(stderr, "Could not create toolhelp snapshot. "
            "Error = 0x%X\n", GetLastError());
        exit(-1);
    }
 
    Heap processHeapInfo;
 
    (void)NtSuspendProcess(processHandle);
 
    size_t reserveSize = 4096;
    std::unique_ptr<unsigned char[]> heapBuffer(new unsigned char[reserveSize]);
 
    HEAPLIST32 heapList = { 0 };
    heapList.dwSize = sizeof(HEAPLIST32);
    if (Heap32ListFirst(snapshot, &heapList))
    {
        do
        {
            HEAPENTRY32 heapEntry = { 0 };
            heapEntry.dwSize = sizeof(HEAPENTRY32);
 
            if (Heap32First(&heapEntry, processId, heapList.th32HeapID))
            {
                do
                {
                    if (IsReadable(processHandle, heapEntry.dwAddress, heapEntry.dwSize))
                    {
                        ReadHeapData(processHandle, heapEntry.dwAddress, heapEntry.dwSize,
                            processHeapInfo, heapBuffer, reserveSize);
                    }
 
                    heapEntry.dwSize = sizeof(HEAPENTRY32);
                } while (Heap32Next(&heapEntry));
            }
 
            heapList.dwSize = sizeof(HEAPLIST32);
        } while (Heap32ListNext(snapshot, &heapList));
    }
 
    (void)NtResumeProcess(processHandle);
 
    (void)CloseHandle(snapshot);
 
    return processHeapInfo;
}

For every heap list and subsequent heap entry, the heap block is read and its byte contents stored in an address -> byte pair. The remote read is just a call around ReadProcessMemory

void ReadHeapData(const HANDLE processHandle, const DWORD_PTR heapAddress, const size_t size, Heap &heapInfo,
    std::unique_ptr<unsigned char[]> &heapBuffer, size_t &reserveSize)
{
    if (size > reserveSize)
    {
        heapBuffer = std::unique_ptr<unsigned char[]>(new unsigned char[size]);
        reserveSize = size;
    }
 
    SIZE_T bytesRead = 0;
    const BOOL success = ReadProcessMemory(processHandle, (LPCVOID)heapAddress, heapBuffer.get(), size, &bytesRead);
 
    if (success == 0)
    {
        fprintf(stderr, "Could not read process memory at 0x%p "
            "Error = 0x%X\n", (void *)heapAddress, GetLastError());
        return;
    }
    if (bytesRead != size)
    {
        fprintf(stderr, "Could not read process all memory at 0x%p "
            "Error = 0x%X\n", (void *)heapAddress, GetLastError());
        return;
    }
 
    for (size_t i = 0; i < size; ++i)
    {
        heapInfo.emplace_hint(std::end(heapInfo), std::make_pair((heapAddress + i), heapBuffer[i]));
    }
}

At this point a snapshot of the heap is created. A screenshot of an example run shows the address -> byte pairs below.heapdiff

The next part is to take another snapshot at a later point in time and begin diffing the heaps. Diffing the heaps involves three scenarios: when a heap entry at the same address has changed, when an entry was removed (in first snapshot but not in second), and when a new allocation was made (in second heap snapshot but not in first). The code is pretty straightforward and performs a search and compare in the first heap against the second heap.

const HeapDiff GetHeapDiffs(const Heap &firstHeap, Heap &secondHeap)
{
    HeapDiff heapDiff;
 
    for (auto &heapEntry : firstHeap)
    {
        auto &secondHeapEntry = std::find_if(std::begin(secondHeap), std::end(secondHeap),
            [&](const std::pair<DWORD_PTR, unsigned char> &entry) -> bool
        {
            return entry.first == heapEntry.first;
        });
 
        if (secondHeapEntry != std::end(secondHeap))
        {
            if (heapEntry.second != secondHeapEntry->second)
            {
                //Entries in both heaps but are different
                heapDiff.emplace_hint(std::end(heapDiff),
                    heapEntry.first, std::make_pair(heapEntry.second, secondHeapEntry->second));
            }
            secondHeap.erase(secondHeapEntry);
        }
        else
        {
            //Entries in first heap and not in second heap
            heapDiff.emplace_hint(std::end(heapDiff),
                heapEntry.first, std::make_pair(heapEntry.second, heapEntry.second));
        }
    }
 
    for (auto &newEntries : secondHeap)
    {
        //Entries in second heap and not in first heap
        heapDiff.emplace_hint(std::end(heapDiff),
            newEntries.first, std::make_pair(newEntries.second, newEntries.second));
    }
 
    return heapDiff;
}

A screenshot post-diff is shown below:

heapdiff1

Looking at the above example, you can see that the bytes at heap address 0x003F0200 changed from 0x2B to 0x57, among many others. The last step is to merge contiguous blocks to make things more simple. The code is omitted here, but a final screenshot is shown below showing the final structure of the heap diff.heapdiff2The diff can be inspected for anything deemed interesting and can aid in reverse engineering an application. For example, to see where text is drawn in a text editor, you can write some text in the editor and take a snapshot

heapdiff3Prior to taking a second snapshot, change some of the text around and inspect the heap differences. For this example, some AA‘s were changed to BB.heapdiff4The heap contents beginning at 0x0079201C contained the text and were noted as changing from A -> B. Attaching a debugger and setting a breakpoint on-write at 0x0079201C showed an access from 0x00402CA5, which is a rep movs instruction responsible for copying the text to draw into the buffer.heapdiff5heapdiff6
The usefulness of this technique is obviously predicated on the desired data to reside in the process heap.

Invasive Heap Diffing

The technique described above is useful because it does not disturb the process state, aside from suspending and resuming it. The inspecting process has no access to the address space of the target process and performs all of its actions remotely. This next technique uses Intel’s Pin dynamic binary instrumentation platform to instrument a target process and monitor only heap writes. This means that, unlike the previous technique, the entire state of the heap does not need to be tracked. Pin allows for tracking of memory writes in a process, among many other things. Pin is injected as a DLL into a process, so all code written within it will have access to the process address space. That means that instead of traversing heap lists and heap entries, the HeapWalk function can be used directly to get all valid heap addresses.

In the example, all current heap addresses are kept in a std::set container. These are retrieved when the DLL is loaded in the process and instrumentation beings:

void WalkHeaps(WinApi::HANDLE *heaps, const size_t size)
{
    using namespace WinApi;
 
    fprintf(stderr, "Walking %i heaps.\n", size);
 
    for(size_t i = 0; i < size; ++i)
    {
        if(HeapLock(heaps[i]) == FALSE)
        {
            fprintf(stderr, "Could not lock heap 0x%X"
                "Error = 0x%X\n", heaps[i], GetLastError());
            continue;
        }
 
        PROCESS_HEAP_ENTRY heapEntry = { 0 };
        heapEntry.lpData = NULL;
        while(HeapWalk(heaps[i], &heapEntry) != FALSE)
        {
            for(size_t j = 0; j < heapEntry.cbData; ++j)
            {
                heapAddresses.insert(std::end(heapAddresses),
                    (DWORD_PTR)heapEntry.lpData + j);
            }
        }
 
        fprintf(stderr, "HeapWalk finished with 0x%X\n", GetLastError());
 
        if(HeapUnlock(heaps[i]) == FALSE)
        {
            fprintf(stderr, "Could not unlock heap 0x%X"
                "Error = 0x%X\n", heaps[i], GetLastError());
        }
    }
 
    size_t numHeapAddresses = heapAddresses.size();
    fprintf(stderr, "Found %i (0x%X) heap addresses.\n",
        numHeapAddresses, numHeapAddresses);
 
}

An instrumentation function is then added, which is called on every instruction execution:

INS_AddInstrumentFunction(OnInstruction, 0);

The OnInstruction function checks to see if it is a memory write. If it is then a call to our inspection function is added and subsequently invoked. This function checks if the address that is being written to is in the heap and logs it if that is the case.

VOID OnMemoryWriteBefore(VOID *ip, VOID *addr)
{
    if(IsInHeap(addr))
    {
        fprintf(trace, "Heap entry 0x%p has been modified.\n", addr);
    }
}

Testing this is pretty simple; create a simple application that allocates some data on the heap and performs constant writes to it:

int main(int argc, char *argv[])
{
    int *heapData = new int;
    *heapData = 0;
 
    fprintf(stdout, "Heap address: 0x%p", heapData);
 
    while(true)
    {
        *heapData = (*heapData + 1) % INT_MAX;
        Sleep(500);
    }
 
    return 0;
}

Running the instrumentation against the a compiled version of the code above produces the following output, showing successful instrumentation and heap tracking.heapdiff9

Heap entry 0x007B4B58 has been modified.
Heap entry 0x007B4B6C has been modified.
Heap entry 0x007B4B70 has been modified.
Heap entry 0x007B4B68 has been modified.
Heap entry 0x007B27C8 has been modified.
Heap entry 0x007B27C8 has been modified.
Heap entry 0x007B27C8 has been modified.
Heap entry 0x007B27C8 has been modified.
Heap entry 0x007B27C8 has been modified.
...

The Pin framework provides a lot more functionality than what is covered in the example code provided. The code can further be expanded to disassemble and interpret the writing address and get the current heap value and the value that will be written as in the first example.

Final Notes

This post presented a couple of techniques for finding differences in process heaps. The example code shows basic examples, but has some issues in terms of scaling; a 100MB heap diff takes about 15 minutes with the current implementation due to the large number of lookups. The code should serve as a good starting point to build on if the target application allocates a large amount of heap space.

Code

The Visual Studio 2015 project for this example can be found here. The source code is viewable on Github here. Thanks for reading and follow on Twitter for more updates.

Runtime DirectX Hooking

December 14th, 2015 1 comment

This post will cover the topic of hooking DirectX in a running application. This post will cover DirectX9 specifically, but the general technique applies to any version. A previous and similar post covered virtual table hooking for DirectX10 and DirectX11 (with minor adjustments). Unlike the previous post, this one aims to establish a technique to hook running DirectX applications. This means that it can be installed at any time, unlike the previous technique, which required starting a process in a suspended state and then hooking to get the device pointer.

Motivations

The motivations are similar to the previous post. By hooking the DirectX device, we can inspect or change the properties of rendered scenes (i.e. depth testing, object colors), overlay text or images, better display visual information, or do anything else with the scene. However, to achieve anything beyond the basics, it also takes a lot of effort in reverse engineering the actual application; simply having access to the rendered scene won’t get you too far.maxresdefault

An example of DirectX hooking to make certain models have a bright color, and to allow seeing of depth through objects that obstruct a view.

SC2Console

An example of outputting reverse engineered data from a client and overlaying it as text in the application. This is a pretty awesome project whose description and source code is available here.
Techniques

Typically when hooking DirectX, there are several popular options:

  • Hook IDirect3D9::CreateDevice and store the IDirect3DDevice9 pointer that is initialized when the function returns successfully. This needs to be done when the process is started in a suspended state, otherwise the device will have already been initialized.
  • Perform a byte pattern scan in memory for the signature of IDirect3DDevice9::EndScene, or any other DirectX function.
  • Create a dummy IDirect3DDevice9 instance, read its virtual table, find the address of EndScene, and hook at the target site.
  • Look for the CD3DBase::EndScene symbol in d3d9.dll and get its address.

Each one has its drawbacks, but my personal preference is the last option. It’s the one that offers the greatest reliability for the least amount of overhead code. The code for it is pretty straightforward, with the help of the Windows debugging APIs:

const DWORD_PTR GetAddressFromSymbols()
{
    BOOL success = SymInitialize(GetCurrentProcess(), nullptr, true);
    if (!success)
    {
        fprintf(stderr, "Could not load symbols for process.\n");
        return 0;
    }
 
    SYMBOL_INFO symInfo = { 0 };
    symInfo.SizeOfStruct = sizeof(SYMBOL_INFO);
 
    success = SymFromName(GetCurrentProcess(), "d3d9!CD3DBase::EndScene", &symInfo);
    if (!success)
    {
        fprintf(stderr, "Could not get symbol address.\n");
        return 0;
    }
 
    return (DWORD_PTR)symInfo.Address;
}

Once the address is retrieved, it’s simply a matter of installing the hook and writing code in the new hook function. The Hekate engine was used for hook installation/removal, making the code simple:

const bool Hook(const DWORD_PTR address, const DWORD_PTR hookAddress)
{
    pHook = std::unique_ptr<Hekate::Hook::InlineHook>(new Hekate::Hook::InlineHook(address, hookAddress));
 
    if (!pHook->Install())
    {
        fprintf(stderr, "Could not hook address 0x%X -> 0x%X\n", address, hookAddress);
    }
 
    return pHook->IsHooked();
}

The EndScene function was chosen specifically due to how DirectX9 applications are developed. For those unfamiliar with DirectX, the flow of rendering a scene generally goes as follows: BeginScene -> Draw the scene -> EndScene -> Present. Other DirectX9 hook implementations hook Present instead of EndScene, it becomes a matter of preference unless the target application does something special. In the example application, some text is overlaid on top of the scene:

HRESULT WINAPI EndSceneHook(void *pDevicePtr)
{
    using pFncOriginalEndScene = HRESULT (WINAPI *)(void *pDevicePtr);
    pFncOriginalEndScene EndSceneTrampoline =
        (pFncOriginalEndScene)pHook->TrampolineAddress();
 
    IDirect3DDevice9 *pDevice = (IDirect3DDevice9 *)pDevicePtr;
    ID3DXFont *pFont = nullptr;
 
    HRESULT result = D3DXCreateFont(pDevice, 30, 0, FW_NORMAL, 1, false,
        DEFAULT_CHARSET, OUT_DEFAULT_PRECIS, ANTIALIASED_QUALITY,
        DEFAULT_PITCH | FF_DONTCARE, L"Consolas", &pFont);
    if (FAILED(result))
    {
        fprintf(stderr, "Could not create font. Error = 0x%X\n", result);
    }
    else
    {
        RECT rect = { 0 };
        (void)SetRect(&rect, 0, 0, 300, 100);
        int height = pFont->DrawText(nullptr, L"Hello, World!", -1, &rect,
            DT_LEFT | DT_NOCLIP, -1);
        if (height == 0)
        {
            fprintf(stderr, "Could not draw text.\n");
        }
        (void)pFont->Release();
    }
 
    return EndSceneTrampoline(pDevicePtr);
}

Building as a DLL and injecting into the running application should show the text overlay (below):

sampleimgdx9

Hekate supports clean unhooking, so unloading the DLL should remove the text and let the application continue undisturbed.

Code

The Visual Studio 2015 project for this example can be found here. The source code is viewable on Github here. The Hekate static library dependency is included in a separate download here and goes into the DirectXHook/lib folder. Capstone Engine is used as a runtime dependency, so capstone_x86.dll/capstone_x64.dll in DirectXHook/thirdparty/capstone/lib should be put in the same directory that the target application is running from.

Thanks for reading and follow on Twitter for more updates