RCE Endeavors 😅

January 15, 2015

Virtual Method Table (VMT) Hooking

Filed under: Game Hacking,General x86,General x86-64,Programming — admin @ 1:39 PM

This post will cover the topic of hooking a class's virtual method table. This is a useful technique that has many applications, but is most commonly seen in developing game hacks. For example, employing VMT hooking of objects in a Direct3D/OpenGL graphics engine is how in-game overlays are displayed.

Virtual Method Tables (or vtables)

In the context of C++ for this post, VMTs are how polymorphism is implemented at the language level. Internally, the VMT is represented as an array of function pointers, and a pointer to it typically resides at the beginning or end of the memory layout of the object. Whenever a C++ class declares a virtual function, the compiler will add an entry into the VMT for it. If a class inherits from a base class and overrides a base virtual function, then the pointer to the overridden function will be present in the derived class's VMT. For example, take the following code, compiled with the VS 2013 compiler on an x86 system:

class Base
{
public:
    Base() { printf("-  Base::Base\n"); }
    virtual ~Base() { printf("-  Base::~Base\n"); }
 
    void A() { printf("-  Base::A\n"); }
    virtual void B() { printf("-  Base::B\n"); }
    virtual void C() { printf("-  Base::C\n"); }
};
 
class Derived final : public Base
{
public:
    Derived() { printf("-  Derived::Derived\n"); }
    ~Derived() { printf("-  Derived::~Derived\n"); }
 
    void B() override { printf("-  Derived::B\n"); }
    void C() override { printf("-  Derived::C\n"); }
};

with the instances of Base and Derived created as follows:

Base base;
Derived derived;
Base *pBase = new Derived;

The class Base has three virtual functions: ~Base, B, and C. The class Derived, which inherits from Base, overrides the two virtual functions B and C. In memory, the VMT for Base will contain ~Base, B, and C, as can be inspected with the debugger:

[Screenshot: debugger view of the Base VMT]

while the VMT for the two Derived instances contains ~Derived, B, and C, but with different addresses for each than the ones in Base (see below).

[Screenshots: debugger views of the VMT for each Derived instance]

So how are these actually used? Take, for example, a function that takes a pointer to a Base instance and invokes the functions A, B, and C, on it:

void Invoke(Base * const pBase)
{
    pBase->A();
    pBase->B();
    pBase->C();
}

and is invoked in the following manner:

    Invoke(&base);
    Invoke(&derived);
    Invoke(pBase);

The Invoke function disassembled for x86 is as follows:

    pBase->A();
004012C9 8B 4D 08             mov         ecx,dword ptr [pBase]  
004012CC E8 8F FE FF FF       call        Base::A (0401160h)  
    pBase->B();
004012D1 8B 45 08             mov         eax,dword ptr [pBase]  
004012D4 8B 10                mov         edx,dword ptr [eax]  
004012D6 8B 4D 08             mov         ecx,dword ptr [pBase]  
004012D9 8B 42 04             mov         eax,dword ptr [edx+4]  
004012DC FF D0                call        eax  
    pBase->C();
004012DE 8B 45 08             mov         eax,dword ptr [pBase]  
004012E1 8B 10                mov         edx,dword ptr [eax]  
004012E3 8B 4D 08             mov         ecx,dword ptr [pBase]  
004012E6 8B 42 08             mov         eax,dword ptr [edx+8]  
004012E9 FF D0                call        eax  

This disassembly shows exactly what is going on under the hood with relation to polymorphism. A is not virtual, so it is called directly at a fixed address. For the invocations of B and C, the compiler moves the address of the object into the EAX register. This is then dereferenced to get the base of the VMT, which is stored in the EDX register. The this pointer is placed in ECX. The appropriate VMT entry for the function is found by adding the function's offset to EDX (4 for B and 8 for C, since the destructor occupies the first slot), and the address stored there is loaded into EAX. This function is then called. Since Base and Derived have different VMTs, this code will call different functions (the appropriate ones) for the appropriate object type. Seeing how it’s done under the hood also allows us to easily write a function to print the VMT.

void PrintVTable(Base * const pBase)
{
    //The first pointer-sized field of the object holds the address of the VMT
    void **pVTableBase = *(void ***)pBase;
    printf("First: %p\n"
        "Second: %p\n"
        "Third: %p\n",
        pVTableBase[0], pVTableBase[1], pVTableBase[2]);
}
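The same idea can be checked without a debugger. The following is a minimal sketch, not code from the post: the Shape and Square classes and the VTableAddress helper are hypothetical names, and it assumes the compiler stores the VMT pointer in the first pointer-sized field of the object, which holds for simple single inheritance on the common MSVC and Itanium ABIs but is not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical classes used only for this illustration.
struct Shape
{
    virtual ~Shape() {}
    virtual int Sides() const { return 0; }
};

struct Square final : Shape
{
    int Sides() const override { return 4; }
};

// Reads the first pointer-sized field of the object, which is assumed
// (ABI detail, not standard C++) to be the address of the class's VMT.
std::uintptr_t VTableAddress(const Shape *obj)
{
    std::uintptr_t vptr = 0;
    std::memcpy(&vptr, static_cast<const void *>(obj), sizeof(vptr));
    return vptr;
}
```

Calling VTableAddress on a Shape and on a Square should give two different values, while two Square instances should report the same value, since every object of a class points at that class's single shared table.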

Hooking the VMT

Knowing the layout of the VMT makes it trivial to hook. To accomplish this, all that is needed is to overwrite the entry in the VMT with the address of the desired hook function. This is done by using the VirtualProtect function to set the appropriate memory permissions, along with memcpy to write in the desired hook address. Note that memcpy is used since everything resides within the same address space; otherwise WriteProcessMemory would have to be used. A hooking routine might look like the following:

void HookVMT(Base * const pBase)
{
    //The first pointer-sized field of the object holds the address of the VMT
    void **pVTableBase = *(void ***)pBase;
    //Slot 1 is B (slot 0 is the destructor)
    void **pVTableFnc = pVTableBase + 1;
    void *pHookFnc = (void *)VMTHookFnc;
 
    //VirtualProtect expects a pointer to a DWORD for the old protection value
    DWORD ulOldProtect = 0;
    (void)VirtualProtect(pVTableFnc, sizeof(void *), PAGE_EXECUTE_READWRITE, &ulOldProtect);
    memcpy(pVTableFnc, &pHookFnc, sizeof(void *));
    (void)VirtualProtect(pVTableFnc, sizeof(void *), ulOldProtect, &ulOldProtect);
}

with VMTHookFnc having a simple definition of

void __fastcall VMTHookFnc(void *pEcx, void *pEdx)
{
    Base *pThisPtr = (Base *)pEcx;
 
    printf("In VMTHookFnc\n");
}

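Here the fastcall calling convention is used to easily retrieve the this pointer. On x86, the compiler passes this in the ECX register for member functions, and __fastcall passes its first argument in ECX and its second in EDX, so the first parameter of the hook receives the this pointer.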
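A related variant, sketched here as an illustration and not taken from the post, leaves the shared table untouched and instead points a single object at a patched copy of it. No page permissions need changing because only the object itself is written. Everything in the sketch is hypothetical (Widget, HookedB, SwapVTablePointer, CallB), and it leans on ABI details that standard C++ does not guarantee: the VMT pointer sits in the first pointer-sized field, slot 0 is the first declared virtual function, and on 64-bit targets this arrives as the first ordinary argument. A 32-bit MSVC build would need the __fastcall trick described above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class. It deliberately has no virtual destructor so that
// slot 0 is B on both the MSVC and Itanium ABIs.
struct Widget
{
    virtual int B() { return 1; }
    virtual int C() { return 2; }
};

// Replacement for Widget::B. Assumes "this" is passed like a first
// ordinary argument (true for x86-64 on Windows and System V).
static int HookedB(void *self)
{
    (void)self;
    return 42;
}

// Private copy of the two-entry table, patched in slot 0.
static void *g_patchedTable[2];

void SwapVTablePointer(Widget *obj)
{
    // Read the object's current table pointer.
    void **realTable = nullptr;
    std::memcpy(&realTable, static_cast<const void *>(obj), sizeof(realTable));

    // Copy the real entries, then replace the entry for B.
    std::memcpy(g_patchedTable, realTable, sizeof(g_patchedTable));
    g_patchedTable[0] = reinterpret_cast<void *>(&HookedB);

    // Point this one object at the patched copy.
    void **patched = g_patchedTable;
    std::memcpy(static_cast<void *>(obj), &patched, sizeof(patched));
}

// Calls B through a volatile pointer so the compiler cannot resolve the
// call at compile time and has to go through the table.
int CallB(Widget *obj)
{
    Widget *volatile p = obj;
    return p->B();
}
```

The trade-off is scope: overwriting the table entry, as HookVMT does, affects every instance of the class, while swapping the pointer affects only the object that was passed in.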

Applications

The application of this technique will show how to hook IDXGISwapChain::Present and allow for rendering/overlaying of text on a Direct3D10 application. This is not the only way to overlay text, nor necessarily the best, but still provides an adequate example to illustrate the point. The target application will be a Direct3D10 sample provided by the June 2010 DirectX SDK. See /Samples/C++/Direct3D10/Tutorials/Tutorial01 in the SDK. The sample application initializes the Direct3D device and swap chain with a call to D3D10CreateDeviceAndSwapChain, then simply sets up a view and renders a blue background on the window (screenshot below).

[Screenshot: the sample application rendering a blue window]

To overlay text on a Direct3D application, the IDXGISwapChain object must be obtained. Then the Present function of the interface must be hooked, since that is the function responsible for showing the rendered image to the user. This is done here by hooking D3D10CreateDeviceAndSwapChain. Once this function is hooked, the hook will call the real D3D10CreateDeviceAndSwapChain function in order to set up the IDXGISwapChain interface. Then the VMT entry for Present will be replaced with a hooked version that renders text. Put into code it looks like the following:

HRESULT WINAPI D3D10CreateDeviceAndSwapChainHook(IDXGIAdapter *pAdapter, D3D10_DRIVER_TYPE DriverType, HMODULE Software,
    UINT Flags, UINT SDKVersion, DXGI_SWAP_CHAIN_DESC *pSwapChainDesc, IDXGISwapChain **ppSwapChain,
    ID3D10Device **ppDevice)
{
 
    printf("In D3D10CreateDeviceAndSwapChainHook\n");
 
    //Create the device and swap chain
    HRESULT hResult = pD3D10CreateDeviceAndSwapChain(pAdapter, DriverType, Software, Flags, SDKVersion,
        pSwapChainDesc, ppSwapChain, ppDevice);
 
    //Save the device and swap chain interface.
    //These aren't used in this example but are generally nice to have addresses to
    if(ppSwapChain == NULL)
    {
        printf("Swap chain is NULL.\n");
        return hResult;
    }
    else
    {
        pSwapChain = *ppSwapChain;
    }
    if(ppDevice == NULL)
    { 
        printf("Device is NULL.\n");
        return hResult;
    }
    else
    {
        pDevice = *ppDevice;
    }
 
    //Get the vtable address of the swap chain's Present function and modify it with our own.
    //Save it to return to later in our Present hook
    if(pSwapChain != NULL)
    {
        DWORD_PTR *SwapChainVTable = (DWORD_PTR *)pSwapChain;
        SwapChainVTable = (DWORD_PTR *)SwapChainVTable[0];
        printf("Swap chain VTable: %p\n", (void *)SwapChainVTable);
        PresentAddress = (pPresent)SwapChainVTable[8];
        printf("Present address: %p\n", (void *)PresentAddress);
 
        DWORD OldProtections = 0;
        VirtualProtect(&SwapChainVTable[8], sizeof(DWORD_PTR), PAGE_EXECUTE_READWRITE, &OldProtections);
        SwapChainVTable[8] = (DWORD_PTR)PresentHook;
        VirtualProtect(&SwapChainVTable[8], sizeof(DWORD_PTR), OldProtections, &OldProtections);
    }
 
    //Create the font that we will be drawing with
    CreateDrawingFont();
 
    return hResult;
}

CreateDrawingFont simply sets up an ID3DX10Font to draw with. Now since the VMT entry was replaced, PresentHook will be invoked instead of Present. Here is where the drawing can be done.

HRESULT WINAPI PresentHook(IDXGISwapChain *thisAddr, UINT SyncInterval, UINT Flags)
{
 
    //printf("In Present (%X)\n", PresentAddress);
 
    RECT Rect = { 100, 100, 200, 200 };
    pFont->DrawTextW(NULL, L"Hello, World!", -1, &Rect, DT_CENTER | DT_NOCLIP, RED);
    return PresentAddress(thisAddr, SyncInterval, Flags);
}

I chose a different calling convention here than for the earlier example code, but everything still functions the same. The end result shows the Present hook successfully rendering the text:

[Screenshot: the sample application with "Hello, World!" drawn over it]

A few important caveats about doing it this way:

  • The hook must be installed prior to the call to D3D10CreateDeviceAndSwapChain. Otherwise handles to the device and swap chain won’t be obtained.
  • ID3DX10Font::DrawText can mess with the blend states, shaders, rasterizer, etc. Overlaying text on an application that makes use of these requires the hook developer to account for this and save/restore the states properly.

The source code for the VMT hook example, the slightly modified Direct3D10 sample application, and the Direct3D10 hook can be found here. The hook uses Microsoft Detours as a dependency to perform the initial hooking of D3D10CreateDeviceAndSwapChain.

May 2, 2014

Messing with MSN Internet Games (2/2)

Filed under: Game Hacking,General x86-64,Reverse Engineering — admin @ 2:35 PM

The not-too-long-awaited followup continues. This post will outline some of the internals of how the common network code residing in zgmprxy.dll works. This DLL is shared across Internet Checkers, Internet Backgammon, and Internet Spades to carry out all of the network functionality. Fortunately, or rather unfortunately from a challenge perspective, Microsoft has provided debugging symbols for zgmprxy.dll. This removes some of the challenge in finding interesting functions, but it still takes a decent amount of reverse engineering knowledge to actually understand how everything works.

Starting Point

The obvious starting point for this is to load and look through the zgmprxy.pdb file provided through the Microsoft Symbol Server. There are tons of good functions to look through, but for the sake of brevity, I will be focusing on four of them here.

?BeginConnect@CStadiumSocket@@QEAAJQEAGK@Z
?SendData@CStadiumSocket@@QEAAHPEADIHH@Z
?DecryptSocketData@CStadiumSocket@@AEAAJXZ
?Disconnect@CStadiumSocket@@QEAAXXZ

Understanding how name decorations work allows for the recovery of a large amount of information, such as the number and types of parameters, function name and class membership information, calling convention (__thiscall for this case obviously, although I treat it as __stdcall with the “this” pointer as the first parameter in the example code), etc.
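As a small illustration of how much structure these names carry, the sketch below recovers only the qualified name from a decorated one. QualifiedNameFromDecorated is a hypothetical helper written for this example. It handles just the simple ?Name@Scope@@... shape seen in the four names above and ignores the encoded parameter and return types, templates, operators, and other special names. For real work, DbgHelp's UnDecorateSymbolName does the full job.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Extracts "Class::Function" from a simple MSVC-decorated name such as
// "?SendData@CStadiumSocket@@QEAAHPEADIHH@Z". Returns an empty string
// if the input does not have the expected shape.
std::string QualifiedNameFromDecorated(const std::string &decorated)
{
    if(decorated.empty() || decorated[0] != '?')
        return std::string();

    // The scope list ends at the first "@@".
    const std::size_t end = decorated.find("@@");
    if(end == std::string::npos)
        return std::string();

    // Names are stored innermost first: Function@Class@Namespace...
    std::vector<std::string> parts;
    std::size_t start = 1;
    while(start < end)
    {
        const std::size_t at = decorated.find('@', start);
        parts.push_back(decorated.substr(start, at - start));
        start = at + 1;
    }

    // Reverse the order and join with "::".
    std::string result;
    for(std::size_t i = parts.size(); i-- > 0; )
    {
        result += parts[i];
        if(i != 0)
            result += "::";
    }
    return result;
}
```

The resulting strings have the same undecorated form that the symbol lookup uses later in the post.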

The Plan

The plan here does not change too much from what happened in the previous post:

  • Get into the address space of the target executable. Nothing here changes from last post.
  • Get the addresses of the above functions. This becomes very simple with the debug/symbol APIs provided by the WinAPI.
  • Install hooks at desired places on the functions.
  • Save off the CStadiumSocket instance so we can call functions in it at our own leisure. As an example for this post, it will be to send custom chat messages instead of the pre-selected ones offered by the games.

DllMain does not change drastically from the last revision.

int APIENTRY DllMain(HMODULE hModule, DWORD dwReason, LPVOID lpReserved)
{
    switch(dwReason)
    {
    case DLL_PROCESS_ATTACH:
        (void)DisableThreadLibraryCalls(hModule);
        if(AllocConsole())
        {
            freopen("CONOUT$", "w", stdout);
            SetConsoleTitle(L"Console");
            SetConsoleTextAttribute(GetStdHandle(STD_OUTPUT_HANDLE), FOREGROUND_RED | FOREGROUND_GREEN | FOREGROUND_BLUE);
            printf("DLL loaded.\n");
        }
        if(GetFunctions())
        {
            pExceptionHandler = AddVectoredExceptionHandler(TRUE, VectoredHandler);
            if(SetBreakpoints())
            {
                if(CreateThread(NULL, 0, DlgThread, hModule, 0, NULL) == NULL)
                    printf("Could not create dialog thread. Last error = %X\n", GetLastError());
            }
            else
            {
                printf("Could not set initial breakpoints.\n");
            }
            printf("CStadiumSocket::BeginConnect: %p\n"
                "CStadiumSocket::SendData: %p\n"
                "CStadiumSocket::DecryptSocketData: %p\n"
                "CStadiumSocket::Disconnect: %p\n",
                (void *)BeginConnectFnc, (void *)SendDataFnc, (void *)DecryptSocketDataFnc, (void *)DisconnectFnc);
        }
        break;
 
    case DLL_PROCESS_DETACH:
        //Clean up here usually
        break;
 
    case DLL_THREAD_ATTACH:
        break;
 
    case DLL_THREAD_DETACH:
        break;
    }
 
    return TRUE;
}

There are four functions now as well as a new thread which will hold a dialog to enter custom chat (discussed later). Memory breakpoints are still used and nothing has changed about how they are added. GetFunctions() has drastically changed in this revision. Instead of finding the target functions through GetProcAddress, the injected DLL can load up symbols at runtime and find the four desired functions through the use of the SymGetSymFromName64 function.

const bool GetFunctions(void)
{
    (void)SymSetOptions(SYMOPT_UNDNAME);
    if(SymInitialize(GetCurrentProcess(), "", TRUE))
    {
        IMAGEHLP_SYMBOL64 imageHelp = { 0 };
        imageHelp.SizeOfStruct = sizeof(IMAGEHLP_SYMBOL64);
 
        (void)SymGetSymFromName64(GetCurrentProcess(), "CStadiumSocket::BeginConnect", &imageHelp);
        BeginConnectFnc = (pBeginConnect)imageHelp.Address;
 
        (void)SymGetSymFromName64(GetCurrentProcess(), "CStadiumSocket::SendData", &imageHelp);
        SendDataFnc = (pSendData)imageHelp.Address;
 
        (void)SymGetSymFromName64(GetCurrentProcess(), "CStadiumSocket::DecryptSocketData", &imageHelp);
        DecryptSocketDataFnc = (pDecryptSocketData)imageHelp.Address;  
 
        (void)SymGetSymFromName64(GetCurrentProcess(), "CStadiumSocket::Disconnect", &imageHelp);
        DisconnectFnc = (pDisconnect)imageHelp.Address;
 
    }
    else
    {
        printf("Could not initialize symbols. Last error = %X\n", GetLastError());
    }
    return ((BeginConnectFnc != NULL) && (SendDataFnc != NULL)
        && (DecryptSocketDataFnc != NULL) && (DisconnectFnc != NULL));
}

Here symbols will be loaded with undecorated names and the target functions will be retrieved. The zgmprxy.pdb file must reside in one of the directories that SymInitialize checks, namely in one of the following:

  • The current working directory of the application
  • The _NT_SYMBOL_PATH environment variable
  • The _NT_ALTERNATE_SYMBOL_PATH environment variable

That is really all there is in terms of large changes from last post, so it’s time to begin actually reversing these four functions.

?BeginConnect@CStadiumSocket@@QEAAJQEAGK@Z

As the function name implies, this is called to begin a connection with the matchmaking service and game. The control flow graph looks pretty straightforward, as is the functionality of BeginConnect.

[Image: control flow graph of BeginConnect]

From a cursory inspection, the function appears to be a wrapper around QueueUserWorkItem. It takes a URL and port number as input, and is responsible for initializing and formatting them before launching an asynchronous task. My x64 -> C interpretation yields something similar to the following (x64 code in comment form, my C translation below). Allocation sizes were retrieved during a trace and don’t necessarily fully reflect the logic:

int CStadiumSocket::BeginConnect(wchar_t *pUrl, unsigned long ulPortNumber)
{
//.text:000007FF34FB24C7                 mov     rcx, r12        ; size_t
//.text:000007FF34FB24CA                 call    ??_U@YAPEAX_K@Z ; operator new[](unsigned __int64)
//.text:000007FF34FB24CF                 mov     rsi, rax
//.text:000007FF34FB24D2                 cmp     rax, rbx
//.text:000007FF34FB24D5                 jnz     short loc_7FF34FB24E1
    wchar_t *strPortNum = new wchar_t[32];
    if(strPortNum == NULL)
        return 0x800404DB;
 
//.text:000007FF34FB24E1                 mov     r8, r12         ; size_t
//.text:000007FF34FB24E4                 xor     edx, edx        ; int
//.text:000007FF34FB24E6                 mov     rcx, rax        ; void *
//.text:000007FF34FB24E9                 call    memset
    memset(strPortNum, 0, 32 * sizeof(wchar_t));
 
//.text:000007FF34FB24EE                 lea     r12, [rbp+3Ch]
//.text:000007FF34FB24F2                 mov     r11d, 401h
//.text:000007FF34FB24F8                 mov     rax, r12
//.text:000007FF34FB24FB                 sub     rdi, r12
//.text:000007FF34FB24FE
//.text:000007FF34FB24FE loc_7FF34FB24FE:                        ; CODE XREF: CStadiumSocket::BeginConnect(ushort * const,ulong)+77j
//.text:000007FF34FB24FE                 cmp     r11, rbx
//.text:000007FF34FB2501                 jz      short loc_7FF34FB251E
//.text:000007FF34FB2503                 movzx   ecx, word ptr [rdi+rax]
//.text:000007FF34FB2507                 cmp     cx, bx
//.text:000007FF34FB250A                 jz      short loc_7FF34FB2519
//.text:000007FF34FB250C                 mov     [rax], cx
//.text:000007FF34FB250F                 add     rax, 2
//.text:000007FF34FB2513                 sub     r11, 1
//.text:000007FF34FB2517                 jnz     short loc_7FF34FB24FE
//.text:000007FF34FB2519
//.text:000007FF34FB2519 loc_7FF34FB2519:                        ; CODE XREF: CStadiumSocket::BeginConnect(ushort * const,ulong)+6Aj
//.text:000007FF34FB2519                 cmp     r11, rbx
//.text:000007FF34FB251C                 jnz     short loc_7FF34FB2522
//.text:000007FF34FB251E
//.text:000007FF34FB251E loc_7FF34FB251E:                        ; CODE XREF: CStadiumSocket::BeginConnect(ushort * const,ulong)+61j
//.text:000007FF34FB251E                 sub     rax, 2 
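    //Copy at most 1025 (0x401) characters of the URL into the member buffer, stopping at the terminator
    for(unsigned int i = 0; i < 1025; ++i)
    {
        m_pBuffer[i] = pUrl[i];
        if(m_pBuffer[i] == 0)
            break;
    }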
 
//.text:000007FF34FB2522                 mov     r9d, 0Ah        ; int
//.text:000007FF34FB2528                 mov     rdx, rsi        ; wchar_t *
//.text:000007FF34FB252B                 mov     ecx, r13d       ; int
//.text:000007FF34FB252E                 lea     r8d, [r9+16h]   ; size_t
//.text:000007FF34FB2532                 mov     [rax], bx
//.text:000007FF34FB2535                 call    _itow_s
    (void)_itow_s(ulPortNumber, strPortNum, 32, 10);
 
//.text:000007FF34FB253A                 mov     [rbp+38h], r13d
//.text:000007FF34FB253E                 mov     r13d, 30h
//.text:000007FF34FB2544                 lea     rcx, [rsp+68h+var_48] ; void *
//.text:000007FF34FB2549                 mov     r8, r13         ; size_t
//.text:000007FF34FB254C                 xor     edx, edx        ; int
//.text:000007FF34FB254E                 mov     [rbp+85Ch], ebx
//.text:000007FF34FB2554                 call    memset
    char partialContextBuffer[48];
    memset(partialContextBuffer, 0, sizeof(partialContextBuffer));
 
//.text:000007FF34FB2559                 lea     ecx, [r13+28h]  ; size_t
//.text:000007FF34FB255D                 mov     [rsp+68h+var_44], ebx
//.text:000007FF34FB2561                 mov     [rsp+68h+var_40], 1
//.text:000007FF34FB2569                 call    ??2@YAPEAX_K@Z  ; operator new(unsigned __int64)
//.text:000007FF34FB256E                 mov     rdi, rax
//.text:000007FF34FB2571                 cmp     rax, rbx
//.text:000007FF34FB2574                 jz      short loc_7FF34FB257E
//.text:000007FF34FB2576                 mov     dword ptr [rax], 1
//.text:000007FF34FB257C                 jmp     short loc_7FF34FB2581
    char *pContextBuffer = new char[88]; 
    if(pContextBuffer == NULL)
        return 0x800404DB;
 
//.text:000007FF34FB2586                 lea     rcx, [rdi+18h]  ; void *
//.text:000007FF34FB258A                 lea     rdx, [rsp+68h+var_48] ; void *
//.text:000007FF34FB258F                 mov     r8, r13         ; size_t
//.text:000007FF34FB2592                 mov     [rdi+8], r12
//.text:000007FF34FB2596                 mov     [rdi+10h], rsi
//.text:000007FF34FB259A                 call    memmove
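    *(int *)(pContextBuffer) = 1; //At 000007FF34FB2576
    *(wchar_t **)(pContextBuffer + 8) = m_pBuffer;   //r12 holds the address of the member buffer at [rbp+3Ch]
    *(wchar_t **)(pContextBuffer + 16) = strPortNum; //rsi holds the allocated port string
    memmove(&pContextBuffer[24], partialContextBuffer, 48);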
 
//.text:000007FF34FB259F                 lea     r11, [rbp+0A80h]
//.text:000007FF34FB25A6                 lea     rax, [rbp+18h]
//.text:000007FF34FB25AA                 lea     rcx, ?AsyncGetAddrInfoW@CStadiumSocket@@SAKPEAX@Z ; Function
//.text:000007FF34FB25B1                 xor     r8d, r8d        ; Flags
//.text:000007FF34FB25B4                 mov     rdx, rdi        ; Context
//.text:000007FF34FB25B7                 mov     [rdi+48h], r11
//.text:000007FF34FB25BB                 mov     [rdi+50h], rax
//.text:000007FF34FB25BF                 call    cs:__imp_QueueUserWorkItem
//.text:000007FF34FB25C5                 cmp     eax, ebx
//.text:000007FF34FB25C7                 jnz     short loc_7FF34FB25D5
//.text:000007FF34FB25C9                 mov     ebx, 800404BFh
//.text:000007FF34FB25CE                 jmp     short loc_7FF34FB25D5
    if(QueueUserWorkItem(&AsyncGetAddrInfo, pContextBuffer, 0) == FALSE)
        return 0x800404BF;
 
//From success case
    return 0;
}
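The copy loop in the middle of BeginConnect behaves like a truncating string copy: it copies characters until it reaches the terminator or runs out of room, then writes a terminator, backing up one character if the buffer was filled. A minimal sketch of that behavior follows. BoundedCopy is a hypothetical name, and char16_t stands in for the 2-byte wide characters the disassembly operates on so that the example behaves the same on any platform.

```cpp
#include <cassert>
#include <cstddef>

// Copies src into dst, writing at most capacity characters including the
// terminator. Returns the number of characters copied, not counting the
// terminator. The result is always terminated when capacity is nonzero.
std::size_t BoundedCopy(char16_t *dst, std::size_t capacity, const char16_t *src)
{
    if(capacity == 0)
        return 0;

    std::size_t i = 0;
    while(i < capacity - 1 && src[i] != 0)
    {
        dst[i] = src[i];
        ++i;
    }

    // Terminate after the last copied character; long input is truncated.
    dst[i] = 0;
    return i;
}
```

With the 0x401 character limit seen in the disassembly, a URL of up to 1024 characters would fit intact and anything longer would be cut off at that point.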

?SendData@CStadiumSocket@@QEAAHPEADIHH@Z

The next function to look at is the SendData function. This function formats the data to send and invokes OnAsyncDataWrite to write it out. The function creates a buffer of max length 0x4010 (16400) bytes, copies in the message buffer, and appends a few fields to the end. There is some handling code in the event that the message is of a handshake type, or if it is a message that is to be queued up. Below is a mostly complete translation of the assembly.

int CStadiumSocket::SendData(char *pBuffer, unsigned int uiLength, bool bIsHandshake, bool bLastHandshake)
{
//.text : 000007FF34FB350C                 cmp     dword ptr[rcx + 0A88h], 0
//.text : 000007FF34FB3513                 mov     rax, [rcx + 840h]
//.text : 000007FF34FB351A                 mov     r13, rdx
//.text : 000007FF34FB351D                 mov     rax, [rax + 10h]
//.text : 000007FF34FB3521                 lea     rdx, aTrue; "true"
//.text : 000007FF34FB3528                 mov     rdi, rcx
//.text : 000007FF34FB352B                 mov[rsp + 58h + var_20], rax
//.text : 000007FF34FB3530                 lea     r11, aFalse; "false"
//.text : 000007FF34FB3537                 mov     ebp, r8d
//.text : 000007FF34FB353A                 mov     r10, r11
//.text : 000007FF34FB353D                 mov     rcx, r11
//.text : 000007FF34FB3540                 mov     r12d, r9d
//.text : 000007FF34FB3543                 cmovnz  r10, rdx
//.text : 000007FF34FB3547                 cmp[rsp + 58h + arg_20], 0
//.text : 000007FF34FB354F                 cmovnz  rcx, rdx
//.text : 000007FF34FB3553                 test    r9d, r9d
//.text : 000007FF34FB3556                 mov[rsp + 58h + var_28], r10
//.text : 000007FF34FB355B                 mov[rsp + 58h + var_30], rcx
//.text : 000007FF34FB3560                 cmovnz  r11, rdx
//.text : 000007FF34FB3564                 mov     r9d, r8d
//.text : 000007FF34FB3567                 lea     rcx, aCstadiumsoc_15; "CStadiumSocket::SendData:\n    BUFFER:  "...
//.text : 000007FF34FB356E                 mov     r8, r13
//.text : 000007FF34FB3571                 mov     edx, ebp
//.text : 000007FF34FB3573                 mov[rsp + 58h + var_38], r11
//.text : 000007FF34FB3578                 call ? SafeDbgLog@@YAXPEBGZZ; SafeDbgLog(ushort const *, ...)
    QueueNode *pQueueNode = m_msgQueue;
 
    const char *strIsHandshake = (bIsHandshake != 0) ? "true" : "false";
    const char *strPostHandshake = (m_bPostHandshake != 0) ? "true" : "false";
    const char *strLastHandshake = (bLastHandshake != 0) ? "true" : "false";
 
    SafeDbgLog("CStadiumSocket::SendData:    BUFFER:    \"%*.S\"    LENGTH:    %u    HANDSHAKE: %s    LAST HS:   %s    POST HS:   %s    Queue:     %u",
        uiLength, pBuffer, uiLength, strIsHandshake, strLastHandshake, strPostHandshake, pQueueNode->Count);
 
//.text : 000007FF34FB357D                 mov     ecx, 4010h; size_t
//.text : 000007FF34FB3582                 call ? ? 2@YAPEAX_K@Z; operator new(unsigned __int64)
//.text : 000007FF34FB3587                 mov     rsi, rax
//.text : 000007FF34FB358A                 mov[rsp + 58h + arg_0], rax
//.text : 000007FF34FB358F                 test    rax, rax
//.text : 000007FF34FB3592                 jz      loc_7FF34FB36B3
//.text : 000007FF34FB3598                 mov     ebx, 4000h
//.text : 000007FF34FB359D                 xor     edx, edx; int
//.text : 000007FF34FB359F                 mov     rcx, rax; void *
//.text : 000007FF34FB35A2                 mov     r8, rbx; size_t
//.text : 000007FF34FB35A5                 call    memset
//.text : 000007FF34FB35AA                 cmp     ebp, ebx
//.text : 000007FF34FB35AC                 mov     rdx, r13; void *
//.text : 000007FF34FB35AF                 cmovb   rbx, rbp
//.text : 000007FF34FB35B3                 mov     rcx, rsi; void *
//.text : 000007FF34FB35B6                 mov     r8, rbx; size_t
//.text : 000007FF34FB35B9                 call    memmove
//.text : 000007FF34FB35BE                 and     dword ptr[rsi + 4000h], 0
//.text : 000007FF34FB35C5                 mov[rsi + 4004h], ebp
//.text : 000007FF34FB35CB                 mov[rsi + 4008h], r12d
//.text : 000007FF34FB35D2                 and     dword ptr[rsi + 400Ch], 0
    char *pFullBuffer = new char[0x4010];
    if(pFullBuffer == NULL)
    {
        return 0;
    }
 
    memset(pFullBuffer, 0, 0x4000);
 
    uiLength = (uiLength < 0x4000) ? uiLength : 0x4000;
    memmove(pFullBuffer, pBuffer, uiLength);
 
    *(int *)&pFullBuffer[0x4000] = 0;
    *(unsigned int *)&pFullBuffer[0x4004] = uiLength;
    *(int *)&pFullBuffer[0x4008] = bIsHandshake; //r12d holds the bIsHandshake argument
    *(int *)&pFullBuffer[0x400C] = 0;
 
//.text : 000007FF34FB35D9                 test    r12d, r12d
//.text : 000007FF34FB35DC                 jz      short loc_7FF34FB3658
//.text : 000007FF34FB35DE                 mov     rax, [rdi + 840h]
//.text : 000007FF34FB35E5                 mov     rbx, [rax]
//.text : 000007FF34FB35E8                 test    rbx, rbx
//.text : 000007FF34FB35EB
//.text : 000007FF34FB35EB loc_7FF34FB35EB : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 119j
//.text : 000007FF34FB35EB                 jz      short loc_7FF34FB364F
//...
//.text : 000007FF34FB364F loc_7FF34FB364F : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) : loc_7FF34FB35EBj
//.text : 000007FF34FB364F                 lea     rcx, aCstadiumsoc_18; "CStadiumSocket::SendData: AddTail in se"...
//.text : 000007FF34FB3656                 jmp     short loc_7FF34FB365F
//.text : 000007FF34FB3658; -------------------------------------------------------------------------- -
//.text : 000007FF34FB3658
//.text : 000007FF34FB3658 loc_7FF34FB3658 : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + E8j
//.text : 000007FF34FB3658                 lea     rcx, aCstadiumsock_9; "CStadiumSocket::SendData: AddTail\n\n"
//.text : 000007FF34FB365F
//.text : 000007FF34FB365F loc_7FF34FB365F : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 162j
//.text : 000007FF34FB365F                 call ? SafeDbgLog@@YAXPEBGZZ; SafeDbgLog(ushort const *, ...)
    bool bAddTail = (!bIsHandshake || pQueueNode->Prev == NULL);
    if(!bIsHandshake)
    {
        SafeDbgLog("CStadiumSocket::SendData: AddTail\n\n");
    }
    else if(pQueueNode->Prev == NULL)
    {
        SafeDbgLog("CStadiumSocket::SendData: AddTail in search.");
    }
 
//.text : 000007FF34FB3664                 mov     rbx, [rdi + 840h]
//.text : 000007FF34FB366B                 lea     rdx, [rsp + 58h + arg_0]
//.text : 000007FF34FB3670                 mov     r8, [rbx + 8]
//.text : 000007FF34FB3674                 xor     r9d, r9d
//.text : 000007FF34FB3677                 mov     rcx, rbx
//.text : 000007FF34FB367A                 call ? NewNode@ 
//.text : 000007FF34FB367F                 mov     rcx, [rbx + 8]
//.text : 000007FF34FB3683                 test    rcx, rcx
//.text : 000007FF34FB3686                 jz      short loc_7FF34FB368D
//.text : 000007FF34FB3688 loc_7FF34FB3688 : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 149j
//.text : 000007FF34FB3688                 mov[rcx], rax
//.text : 000007FF34FB368B                 jmp     short loc_7FF34FB3690
//.text : 000007FF34FB368D; -------------------------------------------------------------------------- -
//.text : 000007FF34FB368D
//.text : 000007FF34FB368D loc_7FF34FB368D : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 192j
//.text : 000007FF34FB368D
    if(bAddTail)
    {
        QueueNode *pNewNode = ATL::CAtlList::NewNode(pQueueNode->Top, pQueueNode->Prev, pQueueNode->Next);
        if(pQueueNode->Next == NULL)
        {
            pQueueNode->Next = pNewNode;
        }
        else
        {
            pQueueNode = pNewNode;
        }        
    //The if(bAddTail) block continues below and is closed after the write call
 
//.text : 000007FF34FB3690                 cmp[rsp + 58h + arg_20], 0
//.text : 000007FF34FB3698                 mov[rbx + 8], rax
//.text : 000007FF34FB369C                 mov     ebx, 1
//.text : 000007FF34FB36A1                 jz      short loc_7FF34FB36A9
//.text : 000007FF34FB36A3                 mov[rdi + 0A88h], ebx
//.text : 000007FF34FB36A9
//.text : 000007FF34FB36A9 loc_7FF34FB36A9 : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 1ADj
//.text : 000007FF34FB36A9                 mov     rcx, rdi
//.text : 000007FF34FB36AC                 call ? OnAsyncDataWrite@CStadiumSocket@@AEAAXXZ; CStadiumSocket::OnAsyncDataWrite(void)
        pQueueNode->Next = pQueueNode;
        if(bLastHandshake)
            m_bPostHandshake = 1; //Only set when the last handshake message is sent
        OnAsyncDataWrite();
    }
 
//.text : 000007FF34FB35EB                 jz      short loc_7FF34FB364F
//.text : 000007FF34FB35ED                 test    rbx, rbx
//.text : 000007FF34FB35F0                 jz      short loc_7FF34FB3644
//.text : 000007FF34FB35F2                 mov     rcx, [rbx + 10h]
//.text : 000007FF34FB35F6                 mov     rax, [rbx]
//.text : 000007FF34FB35F9                 test    rcx, rcx
//.text : 000007FF34FB35FC                 jz      short loc_7FF34FB3607
//.text : 000007FF34FB35FE                 cmp     dword ptr[rcx + 4008h], 0
//.text : 000007FF34FB3605                 jz      short loc_7FF34FB360F
//.text : 000007FF34FB3607
//.text : 000007FF34FB3607 loc_7FF34FB3607 : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 108j
//.text : 000007FF34FB3607                 mov     rbx, rax
//.text : 000007FF34FB360A                 test    rax, rax
//.text : 000007FF34FB360D                 jmp     short loc_7FF34FB35EB
//.text : 000007FF34FB360F; -------------------------------------------------------------------------- -
//.text : 000007FF34FB360F
//.text : 000007FF34FB360F loc_7FF34FB360F : ; CODE XREF : CStadiumSocket::SendData(char *, uint, int, int) + 111j
//.text : 000007FF34FB360F                 lea     rcx, aCstadiumsoc_28; "CStadiumSocket::SendData: InsertBefore "...
//.text : 000007FF34FB3616                 call ? SafeDbgLog@@YAXPEBGZZ; SafeDbgLog(ushort const *, ...)
    else if(bPostHandshake)
    {
        QueueNode *pNodePtr = pQueueNode;
        while(pNodePtr->Next != NULL)
        {
            pNodePtr = pNodePtr->Next;
            if(pNodePtr->pData[0x4008] == 0)
            {
                break;
            } 
        }
        SafeDbgLog("CStadiumSocket::SendData: InsertBefore in search.");
 
//.text : 000007FF34FB361B                 mov     rsi, [rdi + 840h]
//.text : 000007FF34FB3622                 mov     r8, [rbx + 8]
//.text : 000007FF34FB3626                 lea     rdx, [rsp + 58h + arg_0]
//.text : 000007FF34FB362B                 mov     rcx, rsi
//.text : 000007FF34FB362E                 mov     r9, rbx
//.text : 000007FF34FB3631                 call ? NewNode@ 
//.text : 000007FF34FB3636                 mov     rcx, [rbx + 8]
//.text : 000007FF34FB363A                 test    rcx, rcx
//.text : 000007FF34FB363D                 jnz     short loc_7FF34FB3688
//.text : 000007FF34FB363F                 mov     [rsi], rax
//.text : 000007FF34FB3642                 jmp     short loc_7FF34FB3690
        QueueNode *pNewNode = ATL::CAtlList::NewNode(pQueueNode->Top, pQueueNode->Prev, pQueueNode->Next);
        //Follows same insertion logic, except for ->Prev. Sets handshake flag again.
        OnASyncDataWrite();
    }
}

The logic looks rather complicated, but the overall picture is that this function is responsible for scheduling messages leaving the client and tagging them with their type (handshake or not). It allocates and writes the buffer to send out and inserts it into the message queue, which is read by OnASyncDataWrite and sent out after adding the encryption layer. Hooking this function allows messages leaving the client to be filtered for logging, fuzzing/modification, or other purposes.

?DecryptSocketData@CStadiumSocket@@AEAAJXZ

This function is responsible for decrypting socket data after it comes in over the network from the server. When the client sends packets, CStadiumSocket::SendData is called, which in turn calls CStadiumSocket::OnASyncDataWrite; the reverse happens in the receive case, where a CStadiumSocket::OnASyncDataRead function calls CStadiumSocket::DecryptSocketData. The inner workings of this function are not necessarily important, and I will omit my x64 -> C conversion notes. The important part is to get a pointer to the buffer that has been decrypted. Doing so allows messages coming from the server to be monitored and, as in the SendData case, logged or fuzzed to test client robustness. Doing some runtime tracing of this function, I found a good spot to pull the decrypted data from:

.text : 000007FF34FB3D20                 movsxd  rcx, dword ptr[rdi + 400Ch]
.text : 000007FF34FB3D27                 mov     r8d, [r12]; size_t
.text : 000007FF34FB3D2B                 mov     rdx, [r12 + 8]; void *
.text : 000007FF34FB3D30                 add     rcx, rdi; void *
.text : 000007FF34FB3D33                 call    memmove

By the time memmove is called, RDX holds the decrypted buffer and R8 its size. This seems like the perfect place to set the hook, at CStadiumSocket::DecryptSocketData + 0x1C3.
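Given a buffer pointer and size like the ones pulled from RDX and R8, a hook will usually want to print the data without letting stray binary bytes garble the console. The following is a minimal sketch of that idea; FormatBuffer and LogDecryptedData are hypothetical names that do not appear in the original source, and the choice of which bytes to mask is an assumption.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original source): turns a raw buffer
// into a printable string, replacing non-printable bytes with '.'.
std::string FormatBuffer(const char *pBuffer, std::size_t size)
{
    std::string out;
    out.reserve(size);
    for (std::size_t i = 0; i < size; ++i)
    {
        const unsigned char c = static_cast<unsigned char>(pBuffer[i]);
        // Keep printable ASCII and line breaks, mask everything else.
        const bool keep = (c >= 0x20 && c < 0x7F) || c == '\r' || c == '\n';
        out.push_back(keep ? static_cast<char>(c) : '.');
    }
    return out;
}

// Sketch of what a hook body could do with the buffer and size it is given.
void LogDecryptedData(const char *pBuffer, unsigned int size)
{
    const std::string text = FormatBuffer(pBuffer, size);
    std::printf("--- DecryptSocketData (%u bytes) ---\n%s\n", size, text.c_str());
}
```

Because the size is passed explicitly, the buffer does not need to be null-terminated, which matters for data copied straight out of a receive path.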

CStadiumSocket::Disconnect

The last function to look at. What happens here is also not necessarily important for our needs; looking through the assembly, it sends out a “goodbye” message, which the application internally refers to as a SEC_HANDSHAKE, and shuts down send operations on the socket. Messages are still received and written out to the debug log (in the event that debug logging is enabled), and the socket is fully shut down and cleaned up after nothing is left to receive. This function only needs to be hooked if we plan on doing something across multiple games in the same program instance, e.g. we resign a game and start a new one without restarting the application. Seeing this function called lets us know that the CStadiumSocket instance captured by CStadiumSocket::BeginConnect is no longer valid for use.

Wrapping Up

Having all of this done and analyzed, changing the vectored exception handler to hook these functions (or in the middle of a function in the case of CStadiumSocket::DecryptSocketData) is just as simple as it was in the last post:

LONG CALLBACK VectoredHandler(EXCEPTION_POINTERS *pExceptionInfo)
{
    if(pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION)
    {        
        pExceptionInfo->ContextRecord->EFlags |= 0x100;
 
        DWORD_PTR dwExceptionAddress = (DWORD_PTR)pExceptionInfo->ExceptionRecord->ExceptionAddress;
        CONTEXT *pContext = pExceptionInfo->ContextRecord;
 
        if(dwExceptionAddress == (DWORD_PTR)BeginConnectFnc)
        {
            pThisPtr = (void *)pContext->Rcx;
            printf("Starting connection. CStadiumSocket instance is at: %p\n", pThisPtr);
        }
        else if(dwExceptionAddress == (DWORD_PTR)SendDataFnc)
        {
            DWORD_PTR *pdwParametersBase = (DWORD_PTR *)(pContext->Rsp + 0x28);
            SendDataHook((void *)pContext->Rcx, (char *)pContext->Rdx, (unsigned int)pContext->R8, (int)pContext->R9, (int)(*(pdwParametersBase)));
        }
        else if(dwExceptionAddress == (DWORD_PTR)DecryptSocketDataFnc + 0x1C3)
        {
            DecryptSocketDataHook((char *)pContext->Rdx, (unsigned int)pContext->R8);
        }
        else if(dwExceptionAddress == (DWORD_PTR)DisconnectFnc)
        {
            printf("Closing connection. CStadiumSocket instance is being set to NULL\n");
            pThisPtr = NULL;
        }
 
        return EXCEPTION_CONTINUE_EXECUTION;
    }
 
    if(pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP)
    {
        (void)SetBreakpoints();
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

To have some fun, the injected DLL can create a dialog box for chat input and send it over to the server. The game server expects a numeric value corresponding to the allowed chat in the scrollbox, but does not do any checking on it. This allows any arbitrary message to be sent over to the server, and the player on the other side will see it. The only caveat is that space (0x20) characters must be converted to %20. The code is as follows:

INT_PTR CALLBACK DialogProc(HWND hwndDlg, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    switch(uMsg)
    {
    case WM_COMMAND:
        switch(LOWORD(wParam))
        {
            case ID_SEND:
            {
                //Possible condition here where Disconnect is called while custom chat message is being sent.
                if(pThisPtr != NULL)
                {
                    char strSendBuffer[512] = { 0 };
                    char strBuffer[256] = { 0 };
                    GetDlgItemTextA(hwndDlg, IDC_CHATTEXT, strBuffer, sizeof(strBuffer) - 1);
 
                    //Extremely unsafe example code, careful...
                    for (unsigned int i = 0; i < strlen(strBuffer); ++i)
                    {
                        if (strBuffer[i] == ' ')
                        {
                            memmove(&strBuffer[i + 3], &strBuffer[i + 1], strlen(&strBuffer[i]));
                            strBuffer[i] = '%';
                            strBuffer[i + 1] = '2';
                            strBuffer[i + 2] = '0';
                        }
                    }
 
                    _snprintf(strSendBuffer, sizeof(strSendBuffer) - 1,
                        "CALL Chat sChatText=%s&sFontFace=MS%%20Shell%%20Dlg&arfFontFlags=0&eFontColor=12345&eFontCharSet=1\r\n",
                        strBuffer);
 
                    SendDataFnc(pThisPtr, strSendBuffer, (unsigned int)strlen(strSendBuffer), 0, 1);
                }
            }
                break;
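        }
        //Without this break, WM_COMMAND falls through to the default case and reports the message as unhandled.
        break;
    default: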
        return FALSE;
    }
    return TRUE;
}
 
DWORD WINAPI DlgThread(LPVOID hModule)
{
    return (DWORD)DialogBox((HINSTANCE)hModule, MAKEINTRESOURCE(DLG_MAIN), NULL, DialogProc);
}

Here is an example of it at work:
[Image: customchat]
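The space conversion in the dialog procedure grows the string in place inside a fixed 256-byte buffer, which is why it is flagged as unsafe. A bounds-safe alternative is sketched below under the assumption that only spaces need encoding, as in the original loop; EncodeSpaces is a hypothetical helper and is not part of the original source.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original source): returns a copy of
// the input with every space replaced by "%20". Other characters are copied
// unchanged, matching what the original loop encodes.
std::string EncodeSpaces(const std::string &input)
{
    std::string out;
    out.reserve(input.size());
    for (const char c : input)
    {
        if (c == ' ')
            out += "%20";
        else
            out.push_back(c);
    }
    return out;
}
```

The result can be passed to _snprintf through c_str(). Since the output string grows as needed, a chat message full of spaces cannot write past the end of a fixed-size buffer during encoding.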

Additional Final Words

Some other fun things to mess with:

  • Logging can be enabled by patching out
.text : 000007FF34FAB6FA                 cmp     cs : ? m_loggingEnabled@@3_NA, 0; bool m_loggingEnabled
.text : 000007FF34FAB701                 jz      short loc_7FF34FAB77E

and creating a “LoggingEnabled” expandable string registry value at HKEY_CURRENT_USER\Software\Microsoft\zone.com. The logs provide tons of debug output about the internal state changes of the application, e.g.

[Time: 05-01-2014 21:48:59.253]
CStadiumProxyBase::SetInternalState:
    OLD STATE:    0 (IST_NOT_CONNECTED)
    NEW STATE:    2 (IST_JOIN_PENDING)
    NEW STATUS:   1 (STADIUM_CONNECTION_CONNECTING)
    LIGHT STATUS: 0 (STADIUM_CONNECTION_NOT_CONNECTED)
    m_pFullState:    0x00000000
  • The values in the ZS_PublicELO and ZS_PrivateELO tags can be modified to be much higher values. If you do this on two clients you are guaranteed a match against yourself, unless someone else is also doing this.
  • The games have some cases where they do not perform full sanitization of game state, so making impossible moves is sometimes allowed.

The full source code relating to this can be found here.

April 30, 2014

Messing with MSN Internet Games (1/2)

Filed under: Game Hacking,General x86-64,Reverse Engineering — admin @ 12:08 AM

This post covers the fun endeavor of reverse engineering the default MSN Internet Games that come with most “Professional” and higher versions of Windows (although discontinued from Windows 8 onwards), namely the common protocol shared by Internet Backgammon, Internet Checkers, and Internet Spades.

[Image: backgammon]

Upon launching the game and connecting with another player, the first thing to do is to check what port everything is running on. In this case, it was port 443, which is the port most commonly used for SSL. This has the advantage of giving away a known protocol, but the disadvantage of not being able to read/modify any of the outgoing data. It can also mean that there is a custom protocol that is encrypted and has an SSL layer added on top before going out, but fortunately that is not the case here (spoilers).

[Image: ip]

Starting Point

Since SSL is part of the network code, the most logical place to start is in the modules that carry out that work: ncrypt.dll and bcrypt.dll. The prime target here is the SslEncryptPacket function. Presumably, this function will be called somewhere in the chain leading up to the packet leaving the client. Per MSDN, two of the parameters for the function are:

pbInput [in]

    A pointer to the buffer that contains the packet to be encrypted.
cbInput [in]

    The length, in bytes, of the pbInput buffer.

If we can intercept the function call and inspect those parameters, there is a chance of being able to view the data that is leaving the client. If not, then inspecting further down the call stack will eventually lead to the plaintext anyway. There is also a corresponding SslDecryptPacket function which will serve as a starting point to getting and inspecting server responses.

The Plan

The plan of action is pretty straightforward.

  • Get into the address space of the target executable. This will be done through a simple DLL injection.
  • Find the target function for encrypting data (SslEncryptPacket) and decrypting data (follow call from SslDecryptPacket down).
  • Install hooks on these two functions. The chosen method will be through memory breakpoints.
  • Inspect the contents of incoming and outgoing messages in plaintext. Determine the protocol and begin messing with it.

The first step won’t be covered here due to the hundreds of different DLL injection tutorials/guides/tools already out there. The code in the injected DLL will be a pretty direct translation of the above steps. Something akin to the code below:

int APIENTRY DllMain(HMODULE hModule, DWORD dwReason, LPVOID lpReserved)
{
    switch(dwReason)
    {
    case DLL_PROCESS_ATTACH:
        (void)DisableThreadLibraryCalls(hModule);
        if(AllocConsole())
        {
            freopen("CONOUT$", "w", stdout);
            SetConsoleTitle(L"Console");
            SetConsoleTextAttribute(GetStdHandle(STD_OUTPUT_HANDLE), FOREGROUND_RED | FOREGROUND_GREEN | FOREGROUND_BLUE);
            printf("DLL loaded.\n");
        }
        if(GetFunctions())
        {
            pExceptionHandler = AddVectoredExceptionHandler(TRUE, VectoredHandler);
            if(SetBreakpoints())
            {
                printf("BCryptHashData: %p\n"
                    "SslEncryptPacket: %p\n",
                    (void *)BCryptHashDataFnc, (void *)SslEncryptPacketFnc);
            }
            else
            {
                printf("Could not set initial breakpoints.\n");
            }
        }
        break;
 
    case DLL_PROCESS_DETACH:
        //Clean up here usually
        break;
 
    case DLL_THREAD_ATTACH:
        break;
 
    case DLL_THREAD_DETACH:
        break;
    }
 
    return TRUE;
}

A “debug console” instance is created to save the effort of attaching a debugger in each testing instance. Pointers to the desired functions are then retrieved through the GetFunctions() function, and lastly memory breakpoints are installed on the two functions (encryption/decryption) to monitor the data being passed to them. For those wondering where BCryptHashData came from, it was traced down from SslDecryptPacket. It is actually called on both encryption/decryption, but will serve as the point of monitoring received messages from the server (in this post at least).

The second step is very easy and straightforward. By injecting a DLL into the process, we have full access to the process address space, and it is a simple matter of calling GetProcAddress on the desired target functions. This requires only basic WinAPI knowledge.

FARPROC WINAPI GetExport(const HMODULE hModule, const char *pName)
{
    FARPROC pRetProc = (FARPROC)GetProcAddress(hModule, pName);
    if(pRetProc == NULL)
    {
        printf("Could not get address of %s. Last error = %X\n", pName, GetLastError());
    }
 
    return pRetProc;
}
 
const bool GetFunctions(void)
{
    HMODULE hBCryptDll = GetModuleHandle(L"bcrypt.dll");
    HMODULE hNCryptDll = GetModuleHandle(L"ncrypt.dll");
    if(hBCryptDll == NULL)
    {
        printf("Could not get handle to Bcrypt.dll. Last error = %X\n", GetLastError());
        return false;
    }
    if(hNCryptDll == NULL)
    {
        printf("Could not get handle to Ncrypt.dll. Last error = %X\n", GetLastError());
        return false;
    }
    printf("Module handle: %p\n", (void *)hBCryptDll);
 
    BCryptHashDataFnc = (pBCryptHashData)GetExport(hBCryptDll, "BCryptHashData");
    SslEncryptPacketFnc = (pSslEncryptPacket)GetExport(hNCryptDll, "SslEncryptPacket");
 
    return ((BCryptHashDataFnc != NULL) && (SslEncryptPacketFnc != NULL));
}

Installing the hooks (via memory breakpoints) is just an adaptation of the previous post on it. The code looks as follows:

const bool AddBreakpoint(void *pAddress)
{
    SIZE_T dwSuccess = 0;
 
    MEMORY_BASIC_INFORMATION memInfo = { 0 };
    dwSuccess = VirtualQuery(pAddress, &memInfo, sizeof(MEMORY_BASIC_INFORMATION));
    if(dwSuccess == 0)
    {
        printf("VirtualQuery failed on %p. Last error = %X\n", pAddress, GetLastError());
        return false;
    }
 
    DWORD dwOldProtections = 0;
    dwSuccess = VirtualProtect(pAddress, sizeof(DWORD_PTR), memInfo.Protect | PAGE_GUARD, &dwOldProtections);
    if(dwSuccess == 0)
    {
        printf("VirtualProtect failed on %p. Last error = %X\n", pAddress, GetLastError());
        return false;
    }
 
    return true;
}
 
const bool SetBreakpoints(void)
{
    bool bRet = AddBreakpoint(BCryptHashDataFnc);
    bRet &= AddBreakpoint(SslEncryptPacketFnc);
 
    return bRet;
}
 
LONG CALLBACK VectoredHandler(EXCEPTION_POINTERS *pExceptionInfo)
{
    if(pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION)
    {        
        pExceptionInfo->ContextRecord->EFlags |= 0x100;
 
        DWORD_PTR dwExceptionAddress = (DWORD_PTR)pExceptionInfo->ExceptionRecord->ExceptionAddress;
        CONTEXT *pContext = pExceptionInfo->ContextRecord;
 
        if(dwExceptionAddress == (DWORD_PTR)SslEncryptPacketFnc)
        {
            DWORD_PTR *pdwParametersBase = (DWORD_PTR *)(pContext->Rsp + 0x28);
            SslEncryptPacketHook((NCRYPT_PROV_HANDLE)pContext->Rcx, (NCRYPT_KEY_HANDLE)pContext->Rdx, (PBYTE *)pContext->R8, (DWORD)pContext->R9,
                (PBYTE)(*(pdwParametersBase)), (DWORD)(*(pdwParametersBase + 1)), (DWORD *)(*(pdwParametersBase + 2)), (ULONGLONG)(*(pdwParametersBase + 3)),
                (DWORD)(*(pdwParametersBase + 4)), (DWORD)(*(pdwParametersBase + 5)));
        }
        else if(dwExceptionAddress == (DWORD_PTR)BCryptHashDataFnc)
        {
            BCryptHashDataHook((BCRYPT_HASH_HANDLE)pContext->Rcx, (PUCHAR)pContext->Rdx, (ULONG)pContext->R8, (ULONG)pContext->R9);
        }
 
        return EXCEPTION_CONTINUE_EXECUTION;
    }
 
    if(pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP)
    {
        (void)SetBreakpoints();
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

[Image: checkers]

Memory breakpoints will be set on the memory pages that SslEncryptPacket and BCryptHashData are on. When these are hit, a STATUS_GUARD_PAGE_VIOLATION will be raised and caught by the topmost vectored exception handler that the injected DLL installed upon load. The exception address will be checked against the two desired target addresses (SslEncryptPacket/BCryptHashData) and an inspection function will be called. In this case it will just echo the contents of the plaintext data buffers out to the debug console instance. The single-step flag will be set so the program can continue execution by one instruction before raising a STATUS_SINGLE_STEP exception, upon which the memory breakpoints will be reinstalled (since guard page flags are cleared after the page gets accessed). For a more in-depth explanation, see the linked post related to memory breakpoints posted before on this blog.

The x64 ABI (on Windows) stores the first four parameters in RCX, RDX, R8, and R9 respectively, and the rest on the stack. There is no need to worry about locating any extra parameters in the case of BCryptHashData, which only takes four. However, SslEncryptPacket takes ten parameters, so there are another six to locate. In this case, there is no reason to care beyond the fourth parameter, but all of them are passed in for the sake of completeness. The base of the parameters on the stack was found by looking at how the function is called and verifying with a debugger during runtime.
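The Rsp + 0x28 base used in the exception handler follows from the Windows x64 stack layout at the first instruction of a function: the return address sits at [RSP], followed by 32 bytes of shadow space reserved for the four register parameters, and then the fifth parameter onwards. A small sketch of that arithmetic is given below; the helper names are hypothetical, and the stack is modeled as a plain array so the logic can be checked outside of a hooked process.

```cpp
#include <cstddef>
#include <cstdint>

// Windows x64 layout at the first instruction of a function:
//   [RSP + 0x00]          return address
//   [RSP + 0x08 .. 0x27]  32 bytes of shadow (home) space for RCX, RDX, R8, R9
//   [RSP + 0x28]          fifth parameter, then one 8-byte slot per parameter
constexpr std::size_t kReturnAddressSize = 8;
constexpr std::size_t kShadowSpaceSize = 32;
constexpr std::size_t kSlotSize = 8;

// Hypothetical helper: offset from RSP of the n-th parameter (1-based, n >= 5).
constexpr std::size_t StackParameterOffset(unsigned int n)
{
    return kReturnAddressSize + kShadowSpaceSize + (n - 5) * kSlotSize;
}

// Reads the n-th parameter from a snapshot of the stack taken at function entry.
std::uint64_t ReadStackParameter(const std::uint64_t *pStackAtEntry, unsigned int n)
{
    return pStackAtEntry[StackParameterOffset(n) / kSlotSize];
}
```

With this layout the fifth parameter lands at RSP + 0x28 and the tenth at RSP + 0x50, which corresponds to pdwParametersBase through pdwParametersBase + 5 in the handler. The offsets only hold at the function's first instruction, before its prologue adjusts RSP.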

The “hook” code, as mentioned above, will just print out the data buffers. The implementation is given below:

void WINAPI BCryptHashDataHook(BCRYPT_HASH_HANDLE hHash, PUCHAR pbInput, ULONG cbInput, ULONG dwFlags)
{
    printf("--- BCryptHashData ---\n"
        "Input: %.*s\n",
        cbInput, pbInput);
}
 
void WINAPI SslEncryptPacketHook(NCRYPT_PROV_HANDLE hSslProvider, NCRYPT_KEY_HANDLE hKey, PBYTE *pbInput, DWORD cbInput,
                              PBYTE pbOutput, DWORD cbOutput, DWORD *pcbResult, ULONGLONG SequenceNumber, DWORD dwContentType, DWORD dwFlags)
{
    printf("--- SslEncryptPacket ---\n"
        "Input: %.*s\n",
        cbInput, pbInput);
}

What Does It Look Like?

After everything is completed, it is time to inspect the protocol. Below are some selected packet logs from a session of Checkers.

STATE {some large uuid}
Length: 0x000003CD
 
<?xml version="1.0"?>
<StateMessage xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="h
ttp://www.w3.org/2001/XMLSchema" xsi:type="StateMessageEx" xmlns="http://zone.ms
n.com/stadium/wincheckers/">
  <nSeq>4</nSeq>
  <nRole>0</nRole>
  <eStatus>Ready</eStatus>
  <nTimestamp>578</nTimestamp>
  <sMode>normal</sMode>
  <arTags>
    <Tag>
      <id>chatbyid</id>
      <oValue xsi:type="ChatTag">
        <UserID>numeric user id</UserID>
        <Nickname>numeric nickname</Nickname>
        <Text>SYSTEM_ENTER</Text>
        <FontFace>MS Shell Dlg</FontFace>
        <FontFlags>0</FontFlags>
        <FontColor>255</FontColor>
        <FontCharSet>1</FontCharSet>
        <MessageFlags>2</MessageFlags>
      </oValue>
    </Tag>
    <Tag>
      <id>STag</id>
      <oValue xsi:type="STag">
        <MsgID>StartCountDownTimer</MsgID>
        <MsgIDSbKy />
        <MsgD>0</MsgD>
      </oValue>
    </Tag>
  </arTags>
</StateMessage>
 
STATE {some large uuid}
Length: 0x000006D1
 
<?xml version="1.0"?>
<StateMessage xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="h
ttp://www.w3.org/2001/XMLSchema" xsi:type="StateMessageEx" xmlns="http://zone.ms
n.com/stadium/wincheckers/">
  <nSeq>5</nSeq>
  <nRole>0</nRole>
  <eStatus>Ready</eStatus>
  <nTimestamp>2234</nTimestamp>
  <sMode>normal</sMode>
  <arTags>
    <Tag>
      <id>STag</id>
      <oValue xsi:type="STag">
        <MsgID>FrameworkUpdate</MsgID>
        <MsgIDSbKy />
        <MsgD>&lt;D&gt;&lt;StgSet&gt;&lt;SeatCnt&gt;2&lt;/SeatCnt&gt;&lt;GameT&g
t;AUTOMATCH&lt;/GameT&gt;&lt;AILvls&gt;2&lt;/AILvls&gt;&lt;GameM&gt;INIT_GAME&lt
;/GameM&gt;&lt;Start&gt;True&lt;/Start&gt;&lt;PMatch&gt;False&lt;/PMatch&gt;&lt;
ShowTeam&gt;False&lt;/ShowTeam&gt;&lt;/StgSet&gt;&lt;/D&gt;</MsgD>
      </oValue>
    </Tag>
    <Tag>
      <id>STag</id>
      <oValue xsi:type="STag">
        <MsgID>GameInit</MsgID>
        <MsgIDSbKy>GameInit</MsgIDSbKy>
        <MsgD>&lt;GameInit&gt;&lt;Role&gt;0&lt;/Role&gt;&lt;Players&gt;&lt;Playe
r&gt;&lt;Role&gt;0&lt;/Role&gt;&lt;Name&gt;8201314a      01&lt;/Name&gt;&lt;Type
&gt;Human&lt;/Type&gt;&lt;/Player&gt;&lt;Player&gt;&lt;Role&gt;1&lt;/Role&gt;&lt
;Name&gt;1d220e29      01&lt;/Name&gt;&lt;Type&gt;Human&lt;/Type&gt;&lt;/Player&
gt;&lt;/Players&gt;&lt;Board&gt;&lt;Row&gt;0,1,0,1,0,1,0,1&lt;/Row&gt;&lt;Row&gt
;1,0,1,0,1,0,1,0&lt;/Row&gt;&lt;Row&gt;0,1,0,1,0,1,0,1&lt;/Row&gt;&lt;Row&gt;0,0
,0,0,0,0,0,0&lt;/Row&gt;&lt;Row&gt;0,0,0,0,0,0,0,0&lt;/Row&gt;&lt;Row&gt;3,0,3,0
,3,0,3,0&lt;/Row&gt;&lt;Row&gt;0,3,0,3,0,3,0,3&lt;/Row&gt;&lt;Row&gt;3,0,3,0,3,0
,3,0&lt;/Row&gt;&lt;/Board&gt;&lt;GameType&gt;Standard&lt;/GameType&gt;&lt;/Game
Init&gt;</MsgD>
      </oValue>
    </Tag>
  </arTags>
</StateMessage>
 
CALL EventSend messageID=EventSend&XMLDataString=%3CMessage%3E%3CMove%3E%
3CSource%3E%3CX%3E6%3C/X%3E%3CY%3E5%3C/Y%3E%3C/Source%3E%3CTarget%3E%3CX%3E7%3C/
X%3E%3CY%3E4%3C/Y%3E%3C/Target%3E%3C/Move%3E%3C/Message%3E
 
CALL EventSend messageID=EventSend&XMLDataString=%3CMessage%3E%3CGameMana
gement%3E%3CMethod%3EResignGiven%3C/Method%3E%3C/GameManagement%3E%3C/Message%3E

The protocol basically screams XML-RPC. It appears that the entire state of the game is initialized and carried out over these XML messages. From a security perspective, it also presents an interesting target to fuzz, given the large variety of fields present within these messages, and the presence of a length field in the message.
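The CALL messages in the logs carry their XML payload percent-encoded (%3C for <, %3E for >), so a small decoder makes them much easier to read. The sketch below is a generic percent-decoder, not code from the original project; PercentDecode and HexValue are hypothetical names, and malformed sequences are simply copied through unchanged.

```cpp
#include <cstddef>
#include <string>

// Converts one hex digit to its value, or -1 if the character is not hex.
static int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decodes %XX sequences in a logged message.
// Malformed or truncated sequences are copied as-is.
std::string PercentDecode(const std::string &input)
{
    std::string out;
    out.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        if (input[i] == '%' && i + 2 < input.size())
        {
            const int hi = HexValue(input[i + 1]);
            const int lo = HexValue(input[i + 2]);
            if (hi >= 0 && lo >= 0)
            {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 2; // Skip the two hex digits that were just consumed.
                continue;
            }
        }
        out.push_back(input[i]);
    }
    return out;
}
```

Applied to the XMLDataString value of the move message above, this yields plain XML beginning with <Message><Move><Source>.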

Some Issues With This Approach

There are some issues with this approach. Firstly, ncrypt.dll and bcrypt.dll are delay loaded, so our DLL will have to be injected after a multiplayer session starts, or there will have to be some polling loop introduced to check whether these two DLLs have loaded. This is ugly and there is a much better way to get around this that will be talked about in the next post. Secondly, BCryptHashData is used for both incoming and outgoing messages. This makes it more difficult if we wish to mess with these messages as there will have to be logic added to distinguish between client and server messages. This will also be resolved in the next post.

The full source code relating to this can be found here.

July 23, 2011

Messing with Protocols: Applications (3/3)

Filed under: Game Hacking,Reverse Engineering — admin @ 2:47 AM

This will be the concluding post of the “Messing with Protocols” series. It will contain some discussion of what was learned and how to mess with the game a bit as a result. The source code provided can be expanded to send any custom chat packets or be used as a starting point in developing a fuzzer. Since the game does not perform integrity checks on parts of the packet such as a valid timer value (this wasn’t discussed but was found while I was reversing recvfrom and onwards), packets can easily be forged by grabbing the session key from any packet. The only field checked is the DWORD value of 06 00 00 00 which was shown to be written in during the building of the chat packet. This means that a custom chat packet can be sent without having to go through the hassle of having to hook the function that increases and writes the timer into the packet (to get the appropriate value if there was a check). This means that writing a custom packet sender is quite easy. The steps would just be: hook sendto to grab the session key and build the packet placing the session key and the 06 00 00 00 bytes in the appropriate offsets. After that, the packet can be filled with whatever data — either garbage data in the case of a fuzzer, or the structure of a legitimate chat packet.
Below is the source to a sample program that can read other players' team chat as well as pose as a different player.

#pragma comment(lib, "detours.lib")
#pragma comment(lib, "Ws2_32.lib")
 
#include <Windows.h>
#include <stdio.h>
#include <include/detours.h>
 
#define PLAYER_INDEX 20
#define CHAT_FLAG_INDEX 21
#define CHAT_BROADCAST_INDEX 22
#define CHAT_MESSAGE_START_INDEX 37
 
#define CHAT_FLAG 0xDC
 
static int (WINAPI *psendto)(SOCKET s, const char *buf, int len, int flags, const struct sockaddr *to, int tolen) = sendto;
static int (WINAPI *precvfrom)(SOCKET s, char *buf, int len, int flags, struct sockaddr *from, int *fromlen) = recvfrom;
 
char *ghost_command = NULL;
char *new_packet_out = NULL;
const char *ghost_key = "@ghost";
const char *spy_key_on = "@spyon";
const char *spy_key_off = "@spyoff";
 
unsigned char player_to_ghost = 0xFF;
bool is_spy_on = false;
 
int WINAPI recvfrom_hook(SOCKET s, char *buf, int len, int flags, struct sockaddr *from, int *fromlen) {
    __asm pushad
    //Receive first so that buf holds the incoming packet before it is inspected.
    int ret = precvfrom(s, buf, len, flags, from, fromlen);
    if(ret >= CHAT_BROADCAST_INDEX + 8 && (buf[CHAT_FLAG_INDEX] & 0xFF) == CHAT_FLAG && is_spy_on == true) {
        memset((buf + CHAT_BROADCAST_INDEX), 0x59, 8);
    }
    __asm popad
    return ret;
}
 
int WINAPI sendto_hook(SOCKET s, const char *buf, int len, int flags, const struct sockaddr *to, int tolen) {
    __asm pushad
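    //new_packet_out is only 256 bytes; pass larger packets through untouched.
    if(len < 0 || len > 256) {
        __asm popad
        return psendto(s, buf, len, flags, to, tolen);
    }
    memcpy(new_packet_out, buf, len);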
    if((new_packet_out[CHAT_FLAG_INDEX] & 0xFF) == CHAT_FLAG) {
        if((ghost_command = strstr((new_packet_out + CHAT_MESSAGE_START_INDEX), ghost_key)) != NULL) {
            player_to_ghost = (ghost_command[strlen(ghost_key)] - 0x30) & 0xFF;
            memset((new_packet_out + CHAT_BROADCAST_INDEX), 0x4E, 8);
        }
        if(strstr((new_packet_out + CHAT_MESSAGE_START_INDEX), spy_key_on) != NULL) {
            is_spy_on = true;
            memset((new_packet_out + CHAT_BROADCAST_INDEX), 0x4E, 8);
        }
        else if(strstr((new_packet_out + CHAT_MESSAGE_START_INDEX), spy_key_off) != NULL) {
            is_spy_on = false;
            memset((new_packet_out + CHAT_BROADCAST_INDEX), 0x4E, 8);
        }
        if(player_to_ghost == 0x00 || player_to_ghost > 0x8)
            new_packet_out[PLAYER_INDEX] = 0xF;
        else
        {
            new_packet_out[PLAYER_INDEX] = player_to_ghost;
            new_packet_out[CHAT_BROADCAST_INDEX + (player_to_ghost - 1)] = 0x4E;
        }
    }
    int ret = psendto(s, new_packet_out, len, flags, to, tolen);
    __asm popad
    return ret;
}
 
int APIENTRY DllMain(HMODULE hModule, DWORD reason, LPVOID reserved){
    if(reason == DLL_PROCESS_ATTACH) {
        DisableThreadLibraryCalls(hModule);
        new_packet_out = (char *)malloc(256 * sizeof(char));
        (void)DetourTransactionBegin();
        (void)DetourUpdateThread(GetCurrentThread());
        (void)DetourAttach(&(PVOID&)psendto, sendto_hook);
        (void)DetourAttach(&(PVOID&)precvfrom, recvfrom_hook);
        (void)DetourTransactionCommit();
    }
    return TRUE;
}

The sample takes in three commands provided through chat: @ghost to imitate a player, @spyon to enable reading enemy team chat, and @spyoff to disable it. These all work by replacing outgoing or incoming packets. Chat ghosting works by changing the index of the player sending the chat in outgoing packets. Chat spying works by setting the flags on incoming packets so that they display in the client. The usage is shown below:
The chat from the impersonator's perspective, who is impersonating player 3.

The chat visible to other players.
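The ghosting branch of sendto_hook can be isolated into a small, bounds-checked routine that is easier to reason about and test away from the game. The sketch below reuses the packet offsets from the sample, but the function names are hypothetical and only the valid @ghost case (players 1 to 8) is covered; it is not a drop-in replacement for the hook.

```cpp
#include <cstddef>
#include <cstring>

// Offsets and flag value copied from the sample.
constexpr std::size_t kPlayerIndex = 20;
constexpr std::size_t kChatFlagIndex = 21;
constexpr std::size_t kChatBroadcastIndex = 22;
constexpr std::size_t kChatMessageStartIndex = 37;
constexpr unsigned char kChatFlag = 0xDC;

// Hypothetical helper: parses the digit following "@ghost" in the chat text.
// Returns the player number (1-8), or 0 if no valid ghost command is present.
unsigned char ParseGhostTarget(const char *pChatText)
{
    static const char kGhostKey[] = "@ghost";
    const char *pCommand = std::strstr(pChatText, kGhostKey);
    if (pCommand == nullptr)
        return 0;
    // The character right after the key; this is '\0' if the text ends there.
    const char digit = pCommand[sizeof(kGhostKey) - 1];
    if (digit < '1' || digit > '8')
        return 0;
    return static_cast<unsigned char>(digit - '0');
}

// Rewrites the sender index of a chat packet in place. Returns false when the
// packet is too short, is not a chat packet, or the player number is invalid.
bool ApplyGhost(unsigned char *pPacket, std::size_t length, unsigned char player)
{
    if (length <= kChatMessageStartIndex || pPacket[kChatFlagIndex] != kChatFlag)
        return false;
    if (player < 1 || player > 8)
        return false;
    pPacket[kPlayerIndex] = player;
    pPacket[kChatBroadcastIndex + (player - 1)] = 0x4E;
    return true;
}
```

Splitting the parsing from the packet rewrite also makes the length check explicit, which the original hook leaves out.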

After doing all of the reversing, I actually stumbled across a great article which explains the networking code behind the Age of Empires series and provides explanations of what the counters mean and the general architecture of the protocol.

A downloadable PDF of this post can be found here.

June 1, 2011

Messing with Protocols: Reverse Engineering (2/3)

Filed under: Game Hacking,Reverse Engineering — admin @ 7:55 PM

Some prerequisite words: The networking code is aggressively optimized and this post will be extremely difficult to comprehend without decent knowledge of assembly and following along in a debugger. Below is my analysis of how a chat packet is constructed within the game, which was analyzed on a running instance with OllyDbg. Also, the knowledge contained in this post isn’t necessarily needed for being able to forge packets across an established in-game connection.

The process begins by trying to find out where the game grabs the chat from. Logically, to build a chat packet, there needs to be some chat. This should either be the starting point or near to the starting point of building the chat packet to send. Techniques on finding out this starting point are situational. For example, if the chat were entered directly from some input box, it might be wise to breakpoint calls to GetDlgItemText or something similar and follow from there. For this, I chose the approach of setting a breakpoint on sendto and following the call stack backwards. The packet can be fully inspected on the call to sendto as shown below:

As an aside, this is also a good place to test for things like data checksums or even possible exploits in the protocol. Checking for data checksums is pretty easy — if one of those unknown fields in the packet is a checksum, then modifying the data but keeping the checksum the same will make the receiving client report an error and/or not display the chat. Since it was possible to modify the text and still have it appear on the other side, you can conclude that no data checksum is present (or the receiving code doesn’t check it). Additionally, modifying the length of the chat also allows the chat to get through. This is more interesting because there actually is a checksum on the length appended to the packets. Having looked over the recvfrom code, which won’t be discussed in this post, I will just say that the checksum is not checked by the game. There are other things to check for like overflows which can be invoked by setting the chat length to 0xFF and sending chat greater than 0xFF in size to see if the game parses the packet correctly or not. Overall, I didn’t find anything too interesting that didn’t solely cause my own instance (the one sending the packets) of the game to crash.
Back on topic however, the call stack at sendto is shown below:

The goal is to find out where the packet is built, so the best place to look is the furthest back. Checking DPLAYX.6E2E6333 reveals the following data at [ESP+14]:

which shows a chat packet containing the header. This means that it is necessary to check even further back. The next step is to look at the call from DPLAYX.6E2DCBBA. The call is shown below:

6E2DCBAC   6A 01            PUSH 1
6E2DCBAE   57               PUSH EDI
6E2DCBAF   53               PUSH EBX
6E2DCBB0   FF75 14          PUSH DWORD PTR SS:[EBP+14]
6E2DCBB3   FF75 F8          PUSH DWORD PTR SS:[EBP-8]
6E2DCBB6   FF75 1C          PUSH DWORD PTR SS:[EBP+1C]
6E2DCBB9   50               PUSH EAX
6E2DCBBA   E8 74970000      CALL DPLAYX.6E2E6333

Setting a breakpoint on this shows that the fully built packet is stored in EBX. Tracing backwards up the function, the chat is loaded into EBX by

6E2DCB1E   8B5D 18          MOV EBX,DWORD PTR SS:[EBP+18]

which means it’s necessary to go even further backwards since the packet is still fully built at this point. OllyDbg isn’t too great at keeping track of call stacks when the function isn’t called directly. Setting a breakpoint at the top of the function and checking the call stack showed that nothing called it, which definitely is not correct. The easiest approach is to inspect the stack manually for the point of return. When EBP is set up, the stack looks like the following:

That address leads to the following function, with the call at 0x005D356D:

005D3540  /$ 8B91 10470000  MOV EDX,DWORD PTR DS:[ECX+4710]
005D3546  |. 33C0           XOR EAX,EAX
005D3548  |. 85D2           TEST EDX,EDX
005D354A  |. 75 24          JNZ SHORT aoc1_-_C.005D3570
005D354C  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D3550  |. A1 84027900    MOV EAX,DWORD PTR DS:[790284]
005D3555  |. 52             PUSH EDX
005D3556  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D355A  |. 52             PUSH EDX
005D355B  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D355F  |. 8B08           MOV ECX,DWORD PTR DS:[EAX]
005D3561  |. 52             PUSH EDX
005D3562  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D3566  |. 52             PUSH EDX
005D3567  |. 8B5424 14      MOV EDX,DWORD PTR SS:[ESP+14]
005D356B  |. 52             PUSH EDX
005D356C  |. 50             PUSH EAX
005D356D  |. FF51 68        CALL DWORD PTR DS:[ECX+68]
005D3570     C2 1400        RETN 14

This is beginning to lead in the right direction since the call is made directly from the game instead of an additional library. Additionally, this function is only called when chat is to be sent, whereas all of the earlier ones were called whenever any packet was to be sent. The chat is held in EDX when it is pushed at the following instruction:

005D355A  |. 52             PUSH EDX

The packet is still fully built at this point, but the search is almost over. The problem and approach are still the same, but now it’s been heavily isolated. The call stack from this point should contain functions that deal with collecting data and constructing the appropriate packet, instead of networking functions in the DirectPlay and Winsock libraries. Taking the same approach as before and setting a breakpoint at the top of the function shows the following call stack:

Looking at 0x005D6720, one of the functions further down the call stack, begins to show some promise. When this function is entered, the chat string is held in the EAX register. It can be modified and the changes carry on through to the sendto function. This means that it’s not a function working on a temporary copy of the buffer, but that it holds the “real” one that will be written into the chat packet. This seems like a good starting point since there is no sign of any packet header being built. Additionally, inspecting the other functions up the call stack from 0x005D6720 shows that some of them display debug messages dealing with packets and chat headers. This is also a good sign that the real reversing should begin here.

What I personally prefer doing is following the code flow in OllyDbg then highlighting the executed code path in IDA Pro. The assembly listings from IDA will be shown in the code blocks to follow. The function starts at 0x005D6720:

.text:005D6720 ; int __stdcall send_normal(char *Str)
.text:005D6720 send_normal     proc near               ; CODE XREF: sub_4FD360+BDFp
.text:005D6720                                         ; sub_5A6C90+154p ...
.text:005D6720
.text:005D6720 broadcast       = dword ptr -0Ch
.text:005D6720 var_8           = dword ptr -8
.text:005D6720 var_4           = word ptr -4
.text:005D6720 Str             = dword ptr  4
.text:005D6720
.text:005D6720                 sub     esp, 0Ch
.text:005D6723                 mov     eax, 59595959h
.text:005D6728                 push    esi
.text:005D6729                 mov     [esp+10h+broadcast], eax
.text:005D672D                 mov     esi, ecx
.text:005D672F                 mov     [esp+10h+var_8], eax
.text:005D6733                 mov     [esp+10h+var_4], ax
.text:005D6738                 mov     eax, [esi+10E4h] ; number of other players in game
.text:005D673E                 test    eax, eax
.text:005D6740                 jnz     short loc_5D6780

The first thing to notice is that no stack frame is set up. This is the beginning of a lot of incoming optimized code. Every function to come will not set up a stack frame and will work directly relative to ESP. The next thing to notice is that the value 0x59595959 is moved into EAX, which is then moved into [ESP+0x4]. Why 0x59595959? Looking back at the previous example packets, there were eight bytes devoted to who can see the packet. These were set to either Y or N depending on whether the target player is supposed to display the message or not. 0x59 happens to be the ASCII code for ‘Y’, so the beginning of this function sets up to send a message to all players, i.e. the broadcast field in the packet will be YYYYYYYY (or 0x59 0x59 … 0x59). [ESI+0x10E4] holds the number of other players in the game. This is moved into EAX and checked for zero. Assuming there is at least one additional player, the code jumps to 0x005D6780. The case where no other player is in game won’t be discussed in depth, except to say that in that case no packets are sent and the text is only displayed on the screen. Continuing on at 0x005D6780 is the following block:

.text:005D6780 loc_5D6780:                             ; CODE XREF: send_normal+20j
.text:005D6780                 mov     eax, [esp+14h]
.text:005D6784                 mov     edx, [esi+10E0h]
.text:005D678A                 lea     ecx, [esp+10h+broadcast]
.text:005D678E                 push    eax             ; chat string
.text:005D678F                 push    ecx             ; broadcast audience
.text:005D6790                 push    edx             ; player_index
.text:005D6791                 mov     ecx, esi
.text:005D6793                 call    sub_5D67A0
.text:005D6798                 pop     esi
.text:005D6799                 add     esp, 0Ch
.text:005D679C                 retn    4
.text:005D679C send_normal     endp
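
As a side note on the 0x59595959 fill at the top of send_normal, the three stores (two dwords and one word) can be reproduced in a few lines of C++. This is only a sketch: the function name is made up for illustration, and the 10-byte size is read off the listing's stack variables rather than confirmed from the packet format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Mirrors the three stores at the top of send_normal: two dword writes and
// one word write of 0x59 bytes, covering 10 bytes starting at [ESP+4].
// make_broadcast_audience is a hypothetical name, not one from the game.
std::string make_broadcast_audience() {
    unsigned char buf[10];
    const std::uint32_t fill32 = 0x59595959u;  // value moved into EAX
    const std::uint16_t fill16 = 0x5959u;      // AX, the low word of EAX
    std::memcpy(buf + 0, &fill32, sizeof fill32);  // [esp+10h+broadcast]
    std::memcpy(buf + 4, &fill32, sizeof fill32);  // [esp+10h+var_8]
    std::memcpy(buf + 8, &fill16, sizeof fill16);  // [esp+10h+var_4]
    std::string audience(reinterpret_cast<char*>(buf), sizeof buf);
    assert(audience.size() == 10);
    return audience;
}
```

Every byte comes out as ‘Y’, which is why the default audience is "everyone" until the callee narrows it down.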

Following the block at 0x005D6780 is just a matter of stepping over in OllyDbg. The function at 0x005D67A0 is called with the player index, the broadcast audience, and the chat message as parameters. This function is a bit longer and more complicated. The first block is shown below:

.text:005D67A0 ; int __stdcall sub_5D67A0(int player_index, int chat_message, char *Str)
.text:005D67A0 sub_5D67A0      proc near               ; CODE XREF: .text:004A8961p
.text:005D67A0                                         ; sub_4A8970+5Bp ...
.text:005D67A0
.text:005D67A0 var_114         = byte ptr -114h
.text:005D67A0 var_113         = byte ptr -113h
.text:005D67A0 chat_length     = dword ptr -108h
.text:005D67A0 var_104         = byte ptr -104h
.text:005D67A0 Dest            = byte ptr -103h
.text:005D67A0 var_4           = byte ptr -4
.text:005D67A0 player_index    = dword ptr  4
.text:005D67A0 chat_message    = dword ptr  8
.text:005D67A0 Str             = dword ptr  0Ch
.text:005D67A0
.text:005D67A0                 sub     esp, 114h
.text:005D67A6                 xor     eax, eax
.text:005D67A8                 push    ebx
.text:005D67A9                 push    ebp
.text:005D67AA                 mov     ebp, ecx
.text:005D67AC                 push    esi
.text:005D67AD                 mov     ecx, [esp+120h+player_index]
.text:005D67B4                 push    edi
.text:005D67B5                 mov     ax, [ebp+12DCh] ; maximum number of players
.text:005D67BC                 cmp     ecx, eax
.text:005D67BE                 jbe     short loc_5D67D2

This code will be tricky to go through because things will be referenced through EBP and ESP. The important immediate thing to note is that EBP takes the value of ECX, which had ESI moved into it prior to the function call. As usual, no stack frame is set up. All that this code does is compare the index of the player sending the chat (the player_index argument, loaded into ECX) with the maximum number of players allowed in the game (stored in [EBP+0x12DC]). The error condition is that the player index is greater than the maximum number allowed (e.g. player 9 is trying to send a message in an 8 player game). Assuming no error, the jump to 0x005D67D2 is taken. This is a pretty small block which performs an unusual check:

.text:005D67D2 loc_5D67D2:                             ; CODE XREF: sub_5D67A0+1Ej
.text:005D67D2                 mov     ebx, [esp+124h+Str]
.text:005D67D9                 push    ebx             ; Str
.text:005D67DA                 call    _atoi
.text:005D67DF                 mov     esi, 1
.text:005D67E4                 add     esp, 4
.text:005D67E7                 cmp     [ebp+12DCh], si ; [ebp+12DCh] holds number of players
.text:005D67EE                 mov     edi, eax
.text:005D67F0                 jb      loc_5D6883

The purpose of the call to atoi is to check whether a taunt has been entered. Taunts are represented in chat purely by numbers, so atoi will return non-zero if that is the case. The code then compares the player count at [EBP+0x12DC] against 1 and takes the jump to 0x005D6883 only if the count is below that. This is to determine whether it’s necessary to build a packet or not. Assuming there are players to send to, the jump to 0x005D6883 is not taken. A loop is then entered. The important parts of the loop are reproduced below. These are the instructions that are executed when more than one player is to receive some chat.

.text:005D67F6 loc_5D67F6:                             ; CODE XREF: sub_5D67A0+DDj
.text:005D67F6                 push    esi             ; begin loop to see who is allowed to see message
.text:005D67F7                 mov     ecx, ebp
.text:005D67F9                 call    can_player_see
.text:005D67FE                 test    eax, eax
.text:005D6800                 jnz     short loc_5D680E ; player allowed to see message
.text:005D6802                 push    esi             ; player number
.text:005D6803                 mov     ecx, ebp
.text:005D6805                 call    sub_5D9720
.text:005D680A                 test    eax, eax
.text:005D680C                 jz      short loc_5D686C
.text:005D680E
.text:005D680E loc_5D680E:                             ; CODE XREF: sub_5D67A0+60j
.text:005D680E                 mov     ecx, [esp+124h+chat_message] ; player allowed to see message
.text:005D6815                 cmp     byte ptr [esi+ecx], 59h
.text:005D6819                 jnz     short loc_5D686C
.text:005D681B                 mov     edx, dword_7912A0 ; jump not taken, player allowed to see message
.text:005D6821                 mov     [esp+esi+124h+var_113], 59h ; allowed flag
.text:005D6826                 mov     eax, [edx+424h]
.text:005D682C                 test    eax, eax
.text:005D682E                 jz      short loc_5D6871
.text:005D6871
.text:005D6871 loc_5D6871:                             ; CODE XREF: sub_5D67A0+8Ej
.text:005D6871                                         ; sub_5D67A0+98j ...
.text:005D6871                 xor     eax, eax
.text:005D6873                 inc     esi             ; see if next player is allowed to see message
.text:005D6874                 mov     ax, [ebp+12DCh] ; maximum number of players in game
.text:005D687B                 cmp     esi, eax
.text:005D687D                 jbe     loc_5D67F6      ; begin loop to see who is allowed to see message
.text:005D6883
.text:005D6883 loc_5D6883:                             ; CODE XREF: sub_5D67A0+50j
.text:005D6883                 mov     eax, [ebp+10E4h] ; total number of players in game
.text:005D6889                 test    eax, eax
.text:005D688B                 jnz     loc_5D6915
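
As a side note on the atoi call at 0x005D67DA, the taunt test relies on standard atoi behaviour: text with no leading digits converts to 0. A minimal sketch, with taunt_number being a hypothetical name for illustration:

```cpp
#include <cassert>
#include <cstdlib>

// Returns the leading number in the chat text, or 0 when there is none.
// A non-zero result is what the post interprets as a taunt.
// taunt_number is a hypothetical helper, not a function from the game.
int taunt_number(const char* chat) {
    assert(chat != nullptr);
    return std::atoi(chat);
}
```

One consequence of this style of check is that a message that merely starts with digits also yields a non-zero value, and the literal text "0" is indistinguishable from ordinary chat.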

The can_player_see function at 0x005D96E0 just returns 1 or 0 depending on whether the player can see the message. What this loop basically does is decide who is going to see this message. The YYY…Y buffer that was passed in to this function gets modified here according to which player can see the message. The following two instructions set the appropriate byte:

.text:005D6821                 mov     [esp+esi+124h+var_113], 59h ; allowed flag

or

.text:005D686C                 mov     [esp+esi+124h+var_113], 4Eh ; not allowed flag
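
The loop can be approximated in C++ as follows. This is a rough reconstruction from the instructions reproduced above only: the two predicates stand in for can_player_see and sub_5D9720, the function name is invented, and the branch that tests [EDX+0x424] is left out because its target code is not shown.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical reconstruction of the loop at 0x005D67F6. Both strings are
// indexed by player number, so index 0 is unused.
std::string filter_audience(const std::string& requested,
                            unsigned max_players,
                            const std::function<bool(unsigned)>& can_see,
                            const std::function<bool(unsigned)>& second_check) {
    assert(requested.size() > max_players);
    std::string result(requested.size(), '\0');
    for (unsigned player = 1; player <= max_players; ++player) {
        // The second check is only consulted when the first one fails.
        const bool eligible = can_see(player) || second_check(player);
        // cmp byte ptr [esi+ecx], 59h: the caller must also have asked for 'Y'.
        const bool wanted = requested[player] == 'Y';
        result[player] = (eligible && wanted) ? 'Y' : 'N';
    }
    return result;
}
```

Passing a buffer of all ‘Y’ bytes, as send_normal does, means the result depends only on the two predicates.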

The loop continues for eight iterations, the maximum number of players in the game. Once this loop is done, the part of the packet which will hold who can see the message is ready. Once the loop exits, the following code blocks are executed:

.text:005D6883 loc_5D6883:                             ; CODE XREF: sub_5D67A0+50j
.text:005D6883                 mov     eax, [ebp+10E4h] ; total number of players in game
.text:005D6889                 test    eax, eax
.text:005D688B                 jnz     loc_5D6915
.text:005D6915 ; ---------------------------------------------------------------------------
.text:005D6915
.text:005D6915 loc_5D6915:                             ; CODE XREF: sub_5D67A0+EBj
.text:005D6915                                         ; sub_5D67A0+164j
.text:005D6915                 mov     ecx, [ebp+10E0h]
.text:005D691B                 cmp     [esp+ecx+124h+var_113], 59h
.text:005D6920                 jnz     short loc_5D6957 ; EDI holds chat string
.text:005D6922                 mov     eax, [esp+124h+player_index]
.text:005D6929                 mov     edx, [ebp+12CCh] ; username sending chat
.text:005D692F                 mov     ecx, [ebp+18h]
.text:005D6932                 push    0
.text:005D6934                 push    0
.text:005D6936                 push    eax
.text:005D6937                 shl     eax, 7
.text:005D693A                 add     edx, eax        ; EDX holds username
.text:005D693C                 push    ebx             ; EBX holds chat
.text:005D693D                 push    edx
.text:005D693E                 call    sub_5E2780
.text:005D6943                 mov     eax, dword_78BF34
.text:005D6948                 push    ebx
.text:005D6949                 push    offset aLocalChatAddS ; "Local chat add: %s"
.text:005D694E                 push    eax
.text:005D694F                 call    nullsub_1       ; debug function?
.text:005D6954                 add     esp, 0Ch
.text:005D6957
.text:005D6957 loc_5D6957:                             ; CODE XREF: sub_5D67A0+180j
.text:005D6957                 mov     edi, ebx        ; EDI holds chat string
.text:005D6959                 or      ecx, 0FFFFFFFFh
.text:005D695C                 xor     eax, eax
.text:005D695E                 repne scasb             ; Calculates length of string
.text:005D6960                 not     ecx
.text:005D6962                 dec     ecx             ; ECX holds number of characters
.text:005D6963                 cmp     ecx, 0FFh
.text:005D6969                 jbe     short loc_5D6972 ; Calculates length of string again
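
The or / xor / repne scasb / not / dec sequence at 0x005D6957 is a common compiler-inlined strlen. A small C++ emulation of the register arithmetic, with scasb_strlen being a name chosen for this sketch, shows why ECX ends up holding the character count:

```cpp
#include <cassert>
#include <cstdint>

// Emulates: or ecx,-1 / xor eax,eax / repne scasb / not ecx / dec ecx
std::uint32_t scasb_strlen(const char* s) {
    assert(s != nullptr);
    std::uint32_t ecx = 0xFFFFFFFFu;  // or ecx, 0FFFFFFFFh
    const char* edi = s;              // mov edi, ebx
    // repne scasb: each step decrements ECX, compares the byte at EDI with
    // AL (zero) and advances EDI; it stops after the terminator is scanned.
    for (;;) {
        --ecx;
        if (*edi++ == '\0') break;
    }
    ecx = ~ecx;  // not ecx: number of bytes scanned, terminator included
    --ecx;       // dec ecx: drop the terminator
    return ecx;
}
```

For "abc" the scan touches four bytes, leaving ECX at 0xFFFFFFFB; inverting gives 4 and the decrement gives 3.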

There isn’t anything too special about the blocks at 0x005D6915 and 0x005D6957. There is a call to 0x005E2780, which is a huge and complicated function. Fortunately, it doesn’t do anything related to modifying the chat or building a packet, so it doesn’t have to be analyzed. Other than that there is nothing going on except for calculating the length of the string. This is compared against 0xFF, which is the maximum number of characters allowed. Normal execution continues into the following blocks:

.text:005D6972 ; ---------------------------------------------------------------------------
.text:005D6972
.text:005D6972 loc_5D6972:                             ; CODE XREF: sub_5D67A0+1C9j
.text:005D6972                 mov     edi, ebx        ; Calculates length of string again
.text:005D6974                 or      ecx, 0FFFFFFFFh
.text:005D6977                 xor     eax, eax
.text:005D6979                 repne scasb
.text:005D697B                 not     ecx
.text:005D697D                 dec     ecx
.text:005D697E
.text:005D697E loc_5D697E:                             ; CODE XREF: sub_5D67A0+1D0j
.text:005D697E                 mov     dl, byte ptr [esp+124h+player_index]
.text:005D6985                 mov     [esp+124h+chat_length], ecx ; Chat length added to buffer
.text:005D6989                 inc     ecx
.text:005D698A                 lea     eax, [esp+124h+Dest]
.text:005D698E                 push    ecx             ; Count
.text:005D698F                 push    ebx             ; Source
.text:005D6990                 push    eax             ; Dest
.text:005D6991                 mov     [esp+130h+var_114], dl ; Player number who sent message
.text:005D6995                 call    _strncpy
.text:005D699A                 mov     ecx, [ebp+1E3Ch] ; 3?
.text:005D69A0                 mov     edx, [esp+130h+chat_length]
.text:005D69A4                 add     esp, 0Ch
.text:005D69A7                 cmp     ecx, 3
.text:005D69AA                 setz    cl              ; CL = 1
.text:005D69AD                 add     edx, 16h
.text:005D69B0                 push    0
.text:005D69B2                 lea     eax, [esp+128h+var_114]
.text:005D69B6                 push    edx             ; Chat length + 0x16
.text:005D69B7                 push    eax             ; Partial packet. Contains chat flag, chat length, who is allowed to see, and message
.text:005D69B8                 mov     [esp+130h+var_104], cl
.text:005D69BC                 push    43h
.text:005D69BE                 push    0
.text:005D69C0                 mov     ecx, ebp
.text:005D69C2                 mov     [esp+138h+var_4], 0
.text:005D69CA                 call    sub_5D7BC0
.text:005D69CF                 mov     ecx, [ebp+1608h]
.text:005D69D5                 mov     esi, eax
.text:005D69D7                 push    offset aTxchat  ; "TXChat()"
.text:005D69DC                 push    esi
.text:005D69DD                 call    sub_5DF900
.text:005D69E2                 mov     eax, esi
.text:005D69E4                 pop     edi
.text:005D69E5                 pop     esi
.text:005D69E6                 pop     ebp
.text:005D69E7                 pop     ebx
.text:005D69E8                 add     esp, 114h
.text:005D69EE                 retn    0Ch
.text:005D69EE sub_5D67A0      endp
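
The stack variable offsets in the listing for sub_5D67A0 give a rough picture of the buffer handed to sub_5D7BC0. The sketch below lays it out relative to var_114. It is an assumption-heavy illustration: the constant and function names are invented, the bytes after the chat text are simply zeroed because the listing does not show what fills them, and the length is stored in host byte order (little-endian on x86).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Offsets relative to var_114, taken from IDA's stack variable names.
constexpr std::size_t kSenderOff   = 0x00;  // var_114: player number
constexpr std::size_t kAudienceOff = 0x01;  // var_113: indexed by player number
constexpr std::size_t kLengthOff   = 0x0C;  // chat_length (dword)
constexpr std::size_t kFlagOff     = 0x10;  // var_104: result of setz cl
constexpr std::size_t kTextOff     = 0x11;  // Dest: chat text
constexpr std::size_t kExtraSize   = 0x16;  // size passed on is length + 0x16

// Hypothetical helper; audience[0] lands on var_113 itself and is unused.
std::vector<std::uint8_t> build_partial_packet(std::uint8_t sender,
                                               const std::string& audience,
                                               bool flag,
                                               const std::string& text) {
    assert(text.size() <= 0xFF);
    assert(kAudienceOff + audience.size() <= kLengthOff);
    std::vector<std::uint8_t> buf(text.size() + kExtraSize, 0);
    buf[kSenderOff] = sender;
    std::memcpy(buf.data() + kAudienceOff, audience.data(), audience.size());
    const std::uint32_t len = static_cast<std::uint32_t>(text.size());
    std::memcpy(buf.data() + kLengthOff, &len, sizeof len);
    buf[kFlagOff] = flag ? 1 : 0;
    // The terminating null is already present from the zero fill.
    std::memcpy(buf.data() + kTextOff, text.data(), text.size());
    return buf;
}
```

Comparing the output of something like this against the OllyDbg dump is a quick way to spot fields that the sketch does not account for.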

Prior to entering the code at 0x005D6972, the packet buffer contains only who is allowed to see the message. The actual packet buffer is stored relative to ESP and the data will be written directly there. There are some parts that are difficult to analyze like magic values appearing out of nowhere, i.e., [EBP+0x1E3C] holding the value of 3. These don’t affect understanding the code too much unless they play some vital role in how the actual packet will be built (the value is a flag parameter for a function, etc.). With the code above, the CL register is set to 1 since 3 == 3, and is written into the packet buffer. Fortunately, one beneficial thing about this optimized code is that it’s easy to see where the packet is being built. The writes into [ESP+xx] fill the buffer for the packet, which can be verified by inspecting it in the dump with OllyDbg. Checking all places where this occurs and seeing what is being written, the packet will have the following fields before entering the function at 0x005D7BC0: a field set to 0x1 (the CL value being written in), who is allowed to see the chat, the chat message itself, and an additional special value 0xDC. This value is always found in the same field and always has the same value. A view of the packet from the dump is shown below:

The call to 0x005D7BC0 will add the remaining fields of the packet (the two supposed counter values, and fields marked as unknown in the first post). Since this function is pretty big, I won’t duplicate the entire thing here, but only relevant parts — this post is meant to be followed along with in a disassembler after all. This function allocates a new block of memory and returns the entire packet to send (excluding the fields added by DirectPlay’s networking code). The “second counter” bytes are written in by the instructions listed below:

.text:005D7C1A loc_5D7C1A:                             ; CODE XREF: sub_5D7BC0+39j
.text:005D7C1A                 push    0Ch             ; unsigned int
.text:005D7C1C                 call    ??2@YAPAXI@Z    ; operator new(uint)
.text:005D7C21                 mov     edx, [esp+34h+arg_C]
.text:005D7C25                 mov     ecx, [ebp+4714h] ;time stamp
.text:005D7C2B                 mov     edi, eax
.text:005D7C2D                 mov     al, [ebp+1DD0h] ; 1
.text:005D7C33                 add     esp, 4
.text:005D7C36                 add     ecx, edx
.text:005D7C38                 test    al, al
.text:005D7C3A                 mov     [esp+30h+Memory], edi
.text:005D7C3E                 mov     [ebp+4714h], ecx ; 0x51E
.text:005D7C44                 jz      short loc_5D7C7A
.text:005D7C46                 mov     ecx, [ebp+1DA0h] ; Retrieve second counter
.text:005D7C4C                 mov     [edi+8], ecx    ; Write value into packet
.text:005D7C4F                 mov     eax, [ebp+10E0h] ; player position
.text:005D7C55                 mov     ecx, [ebp+1DA0h] ; second counter value
.text:005D7C5B                 mov     [ebp+eax*4+1DE4h], ecx
.text:005D7C62                 mov     eax, [ebp+1DA0h] ; Get second counter value
.text:005D7C68                 inc     eax             ; Increment it for next packet
.text:005D7C69                 mov     [ebp+1DA0h], eax ; Write new value back in
.text:005D7C6F                 mov     eax, 0Ch
.text:005D7C74                 mov     [esp+30h+var_20], eax
.text:005D7C78                 jmp     short loc_5D7C89
.text:005D7C89
.text:005D7C89 loc_5D7C89:                             ; CODE XREF: sub_5D7BC0+B8j
.text:005D7C89                 lea     ecx, [eax+edx]  ; size of header + data
.text:005D7C8C                 cmp     ecx, 0FA0h      ; maximum packet size
.text:005D7C92                 mov     [esp+30h+var_1C], ecx ; write in full size
.text:005D7C96                 jbe     short loc_5D7CB3
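
The counter handling and the size check in the listing above can be summarized in C++. This is a sketch under assumptions: SenderState is a hypothetical stand-in for the object addressed through EBP, only the one field is modeled, and the function names are invented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint32_t kHeaderSize = 0x0C;   // mov eax, 0Ch
constexpr std::uint32_t kMaxPacket  = 0xFA0;  // cmp ecx, 0FA0h

struct SenderState {               // hypothetical model of the EBP object
    std::uint32_t second_counter;  // [EBP+0x1DA0]
};

// Writes the current counter at header offset 8, then bumps the stored copy.
void stamp_second_counter(std::vector<std::uint8_t>& header, SenderState& s) {
    assert(header.size() >= kHeaderSize);
    std::memcpy(header.data() + 8, &s.second_counter,
                sizeof s.second_counter);  // mov [edi+8], ecx
    ++s.second_counter;                    // inc eax / mov [ebp+1DA0h], eax
}

// True when header + data passes the jbe at 0x005D7C96.
bool fits_in_packet(std::uint32_t data_size) {
    return kHeaderSize + data_size <= kMaxPacket;
}
```

Like the lea in the original, the addition in fits_in_packet wraps on overflow, so a very large data size would slip past the comparison.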

Again there are some magic values that seemingly appear out of nowhere. There are some more familiar ones like [EBP+0x10E0], which was shown in previous functions as storing the player index. The value of the counter is kept at [EBP+0x1DA0], which is written into the packet through EDI. The value that appears to be the timestamp is kept at [EBP+0x4714]; in this block it has the fourth argument added to it and is stored back. The code in the second block just grabs the counter, writes it to the packet buffer, and increments it for the next time it is used. The next block computes the size of the header plus data and checks that it does not exceed the maximum packet size of 0xFA0 bytes. The next field is an unknown field that is written by:

.text:005D7CB3
.text:005D7CB3 loc_5D7CB3:                             ; CODE XREF: sub_5D7BC0+D6j
.text:005D7CB3                 xor     eax, eax
.text:005D7CB5                 mov     byte ptr [edi+1], 0
.text:005D7CB9                 mov     byte ptr [edi], 0
.text:005D7CBC                 mov     al, [ebp+1E74h] ; Number of players
.text:005D7CC2                 mov     ecx, [ebp+10C8h] ; 2?
.text:005D7CC8                 lea     ebx, [ebp+1DA8h] ; 4?
.text:005D7CCE                 add     eax, ecx
.text:005D7CD0                 mov     [esp+30h+var_14], eax
.text:005D7CD4                 mov     [edi+4], eax    ; 06 00 00 00 added here

The block at 0x005D7CB3 writes 06 00 00 00 into the packet, which comes from adding [EBP+0x1E74] and [EBP+0x10C8] together. Where these two values come from, I’m not entirely sure. However, they remain constant across all packets regardless of size, player index, and so on, so it’s not too important for being able to forge packets. The other counter value can be found at

.text:005D7D22                 mov     ecx, [esp+30h+var_14] ; Full packet size
.text:005D7D26                 push    ecx
.text:005D7D27                 mov     ecx, ebp
.text:005D7D29                 call    sub_5D7B50
.text:005D7D2E                 mov     [edi+1], al     ; Counter value
.text:005D7D31                 mov     ecx, [ebp+1E3Ch] ; 3?
.text:005D7D37                 cmp     ecx, 5
.text:005D7D3A                 jz      short loc_5D7D6C
.text:005D7D3C                 test    al, al
.text:005D7D3E                 jnz     short loc_5D7D6C

in the call to 0x005D7B50, which returns the counter byte. This can be seen here

.text:005D7B6F ; ---------------------------------------------------------------------------
.text:005D7B6F
.text:005D7B6F loc_5D7B6F:                             ; CODE XREF: sub_5D7B50+Cj
.text:005D7B6F                 mov     al, byte ptr dword_790FAC ; Get counter byte
.text:005D7B74                 inc     al              ; Increment counter byte
.text:005D7B76                 cmp     al, 0FFh        ; Compare with max allowed
.text:005D7B78                 mov     byte ptr dword_790FAC, al ; Write value back
.text:005D7B7D                 jb      short loc_5D7BA3

which just gets the byte, increments it, and resets it once it reaches 0xFF. The last important block of code is shown below:

.text:005D7D6C
.text:005D7D6C loc_5D7D6C:                             ; CODE XREF: sub_5D7BC0+17Aj
.text:005D7D6C                                         ; sub_5D7BC0+17Ej
.text:005D7D6C                 mov     eax, [esp+30h+var_1C] ; Full packet size
.text:005D7D70                 mov     [edi], bl       ; Packet type 60
.text:005D7D72                 inc     eax
.text:005D7D73                 push    eax             ; unsigned int
.text:005D7D74                 call    ??2@YAPAXI@Z    ; operator new(uint)
.text:005D7D79                 mov     ebx, eax        ; Buffer size of the packet header + data
.text:005D7D7B                 mov     eax, [esp+34h+var_20] ; 0xC?
.text:005D7D7F                 mov     ecx, eax
.text:005D7D81                 mov     esi, edi
.text:005D7D83                 mov     edx, ecx
.text:005D7D85                 mov     edi, ebx
.text:005D7D87                 shr     ecx, 2
.text:005D7D8A                 rep movsd               ; Packet minus session key is written here
.text:005D7D8C                 mov     ecx, edx
.text:005D7D8E                 add     esp, 4
.text:005D7D91                 and     ecx, 3
.text:005D7D94                 mov     [esp+30h+var_10], ebx
.text:005D7D98                 rep movsb
.text:005D7D9A                 mov     ecx, [esp+30h+arg_C]
.text:005D7D9E                 test    ecx, ecx
.text:005D7DA0                 jz      short loc_5D7DB7

Prior to entering this block, the packet is fully built. This is responsible for creating the buffer for the full packet to send and copying the contents in there. There’s not much more to it at this point. Looking down the function, it will call into another function which in turn calls into DirectPlay (seen earlier in this post).
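
The rep movsd / rep movsb pair in that block is the usual inlined memcpy: copy the length divided by four as dwords, then the remaining zero to three bytes one at a time. A minimal C++ sketch of the same split, with a function name chosen for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies n bytes the way the rep movsd / rep movsb pair does.
void copy_dwords_then_bytes(std::uint8_t* dst, const std::uint8_t* src,
                            std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    const std::size_t dwords = n >> 2;  // shr ecx, 2
    const std::size_t tail   = n & 3;   // and ecx, 3
    for (std::size_t i = 0; i < dwords; ++i) {  // rep movsd
        std::memcpy(dst, src, 4);
        dst += 4;
        src += 4;
    }
    for (std::size_t i = 0; i < tail; ++i) {    // rep movsb
        *dst++ = *src++;
    }
}
```

Recognizing this pattern makes it easy to skip over such blocks when tracing, since ESI, EDI, and the original ECX identify the source, destination, and size.
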
The entire explanation has basically been following across the call stack and seeing how the chat input is transformed from its initial stage into a fully built packet. The specific knowledge gained may not have been too great (there are still those unknown fields); however, it was useful to see that there are no special checksums or integrity checks performed on the data which will go into the packet. Also, it was possible to learn how those counters in the packet function, how their values change, and whether their values have any effect on how data will be transmitted. I don’t expect this explanation to be incredibly clear since it was written over a period of a few days, but these are some notes that I wanted to publish online both for my records and as a demonstration of how difficult and confusing it can be to reverse even one specific type of packet in a protocol. Unfortunately, the code is extremely optimized, so this post may not serve as a great starting point into reversing protocols altogether, but the general idea should be the same: find out where and how certain input is transformed into a transmittable packet.

A downloadable PDF of this post can be found here.
