Practical Reverse Engineering Solutions – Page 35 (Part III)

My go at exercise 6 on page 35

This blog post presents my solution to exercise 6 on page 35 from the book Practical Reverse Engineering by Bruce Dang, Alexandre Gazet and Elias Bachaalany (ISBN: 1118787315). The book is my first contact with reverse engineering, so take my statements with a grain of salt. All code snippets are on GitHub. For an overview of my solutions, consult this progress page.

Warning: Of all 4 malware exercises I’m least happy with my results for exercise 6. I have yet to read Chapter 3 and I’ll definitely revisit my solutions as recommended by the book.

Problem Statement

Sample H. The function sub_13846 references several structures whose types are not entirely clear. Your task is to first recover the function prototype and then try to reconstruct the structure fields. After reading Chapter 3, return to this exercise to see if your understanding has changed. (Note: This sample is targeting Windows XP x86.)

Disassembly

Here’s the disassembly of the subroutine sub_13842. (The problem statements in the book consistently give offsets that are 4 bytes too large for sample H. From the context it is clear that the book's sub_13846 refers to this routine, sub_13842.)

.text:00013842 ; =============== S U B R O U T I N E =======================================
.text:00013842
.text:00013842
.text:00013842 sub_13842       proc near               ; CODE XREF: sub_1386E+2E8p
.text:00013842                                         ; sub_13BE2+84p ...
.text:00013842                 mov     eax, [ecx+60h]
.text:00013845                 push    esi
.text:00013846                 mov     esi, [edx+8]
.text:00013849                 dec     byte ptr [ecx+23h]
.text:0001384C                 sub     eax, 24h
.text:0001384F                 mov     [ecx+60h], eax
.text:00013852                 mov     [eax+14h], edx
.text:00013855                 movzx   eax, byte ptr [eax]
.text:00013858                 push    ecx
.text:00013859                 push    edx
.text:0001385A                 call    dword ptr [esi+eax*4+38h]
.text:0001385E                 pop     esi
.text:0001385F                 retn
.text:0001385F sub_13842       endp

Function Prototype

The routine probably uses the __fastcall calling convention, so ecx holds the first function parameter and edx the second. Both registers are read before the routine writes to them, which supports this assumption. ecx and edx are probably pointers to structures, and 60h and 8 are offsets of members of these structs. For instance, [ecx+60h] retrieves the member of the struct *ecx at an offset of 96 bytes. The function prototype is:

unknown_type __fastcall sub_13842(struct _arg_1 *arg_1, struct _arg_2 *arg_2) {
...
}

Walk-Through

The first line copies the member at offset 60h from the first function parameter:

.text:00013842                 mov     eax, [ecx+60h]
struct _v1 *v1 = arg_1->off_60h;

The next two lines do a similar thing for the second function parameter:

.text:00013845                 push    esi
.text:00013846                 mov     esi, [edx+8]
struct _v2 *v2 = arg_2->off_8h;

Then a member of arg_1 is decremented:

.text:00013849                 dec     byte ptr [ecx+23h]
arg_1->off_23h--;

The next lines subtract 36 from v1 and store the result back at arg_1->off_60h:

.text:0001384C                 sub     eax, 24h
.text:0001384F                 mov     [ecx+60h], eax
v1 = (struct _v1 *)((char *)v1 - 36); /* 36 bytes, not 36 elements */
arg_1->off_60h = v1;

The next line stores the second function parameter arg_2 at v1->off_14h. Note that v1 was reduced by 36 before, so the offset 14h (20) actually refers to the negative offset -16 from the original value of arg_1->off_60h:

.text:00013852                 mov     [eax+14h], edx
v1->off_14h = arg_2;
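The offset arithmetic can be checked with a small sketch. The helper names and the 64-byte buffer are made up for illustration; only the constants 24h and 14h come from the disassembly, and a single byte is written instead of a 4-byte pointer to keep the example short.

```c
#include <stddef.h>

/* Hypothetical helper: where does a field at `field_off` land, measured from
   the ORIGINAL pointer, after the pointer was moved back by `step` bytes? */
static ptrdiff_t offset_from_original(ptrdiff_t step, ptrdiff_t field_off)
{
    return field_off - step;
}

/* The same check on a real buffer. Returns 1 if the write through the
   adjusted pointer is visible at original - 16. */
static int write_lands_at_minus_16(void)
{
    unsigned char buf[64] = {0};
    unsigned char *orig = buf + 40;   /* stands in for the old arg_1->off_60h */
    unsigned char *v1 = orig - 0x24;  /* sub eax, 24h */

    v1[0x14] = 0xAA;                  /* mov [eax+14h], edx (one byte only) */
    return orig[-16] == 0xAA;
}
```

Calling offset_from_original(0x24, 0x14) gives the -16 mentioned above.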

The next line retrieves the value at v1->off_0h:

.text:00013855                 movzx   eax, byte ptr [eax]
unsigned char index = v1->off_0h; /* movzx zero-extends the byte */
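Since movzx zero-extends, the byte is best treated as an unsigned value. The following sketch contrasts zero extension with sign extension (movsx, which the routine does not use). The function names are invented for this illustration.

```c
#include <stdint.h>

/* movzx: widen an 8-bit value by filling the upper bits with zeros. */
static int32_t zero_extend8(uint8_t raw)
{
    return (int32_t)raw;
}

/* movsx (shown only for contrast): widen by replicating the sign bit. */
static int32_t sign_extend8(uint8_t raw)
{
    return (raw & 0x80) ? (int32_t)raw - 256 : (int32_t)raw;
}
```

For bytes below 80h both functions agree. For larger bytes only the zero-extended value is usable as a non-negative array index.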

The next two lines push the parameters for a __stdcall function onto the stack. The two parameters are arg_1 and arg_2. __stdcall pushes its arguments from right to left, so arg_1 in ecx, which is pushed first, is the last parameter, and arg_2 in edx is the first. The routine does not adjust esp after the call, which fits __stdcall, where the callee cleans up the stack. The address esi + 38h is the member v2->off_38h. index*4 is added to this address, so v2->off_38h is probably an array of 4-byte DWORDs. Since the selected element is the target of the call instruction, the elements are actually function pointers.

.text:00013858                 push    ecx
.text:00013859                 push    edx
.text:0001385A                 call    dword ptr [esi+eax*4+38h]
unknown_type (__stdcall *func)(struct _arg_2 *, struct _arg_1 *) = v2->off_38h[index];
unknown_type return_value = (*func)(arg_2, arg_1);
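The effective address of the call operand [esi+eax*4+38h] can be written out as plain arithmetic. This is only a sketch: slot_address is a made-up name, and addresses are modelled as 32-bit integers because the sample targets x86.

```c
#include <stdint.h>

/* Address of the function-pointer slot that the call instruction reads:
   base (esi) + index (eax) * 4 + 38h. */
static uint32_t slot_address(uint32_t base, uint8_t index)
{
    return base + (uint32_t)index * 4u + 0x38u;
}
```

With a hypothetical base of 1000h and index 3, the slot is at 1000h + 0Ch + 38h = 1044h.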

Finally the routine returns. Since eax is not modified after call, the return value is the return value of func:

.text:0001385E                 pop     esi
.text:0001385F                 retn
return return_value;

C-Code of Subroutine

To summarize, this is the C code:

unknown_type __fastcall sub_13842(struct _arg_1 *arg_1, struct _arg_2 *arg_2) {
    struct _v1 *v1 = arg_1->off_60h;
    struct _v2 *v2 = arg_2->off_8h;
    arg_1->off_23h--;
    v1 = (struct _v1 *)((char *)v1 - 36); /* 36 bytes, not 36 elements */
    arg_1->off_60h = v1;
    v1->off_14h = arg_2;
    unsigned char index = v1->off_0h; /* movzx zero-extends the byte */
    unknown_type (__stdcall *func)(struct _arg_2 *, struct _arg_1 *) = v2->off_38h[index];
    unknown_type return_value = (*func)(arg_2, arg_1);
    return return_value;
}
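To see the control flow in action, here is a minimal model that compiles on any host. It is a sketch under several assumptions: all struct and field names (arg1_model, counter, current, owner, handlers) are invented, the calling conventions are dropped, the return type is assumed to be int, and the structs use the host's natural layout instead of the exact offsets from the disassembly.

```c
#include <stddef.h>

struct arg1_model;
struct arg2_model;
typedef int (*handler_fn)(struct arg2_model *, struct arg1_model *);

struct s1_model   { unsigned char index; struct arg2_model *owner; };
struct s2_model   { handler_fn handlers[4]; };
struct arg2_model { struct s2_model *s2; };
struct arg1_model { signed char counter; struct s1_model *current; };

/* Same steps as the routine: decrement the counter, move the pointer back
   by one element, remember arg_2 in that element, dispatch on its index. */
static int sub_13842_model(struct arg1_model *arg_1, struct arg2_model *arg_2)
{
    struct s2_model *v2 = arg_2->s2;
    arg_1->counter--;
    arg_1->current--;                      /* one element = sizeof(struct s1_model) bytes */
    struct s1_model *v1 = arg_1->current;
    v1->owner = arg_2;
    return v2->handlers[v1->index](arg_2, arg_1);
}

/* Hypothetical handler: reports the counter it sees, plus 100. */
static int handler_two(struct arg2_model *a2, struct arg1_model *a1)
{
    (void)a2;
    return 100 + a1->counter;
}

/* Builds a tiny scenario and returns the handler's result,
   or -1 if the side effects are not as expected. */
static int run_model(void)
{
    struct s2_model s2 = { { NULL, NULL, handler_two, NULL } };
    struct arg2_model a2 = { &s2 };
    struct s1_model slots[3] = { { 0, NULL }, { 2, NULL }, { 0, NULL } };
    struct arg1_model a1 = { 3, &slots[2] };

    int result = sub_13842_model(&a1, &a2);
    if (a1.current != &slots[1] || slots[1].owner != &a2)
        return -1;
    return result;
}
```

In this scenario the pointer moves from slots[2] to slots[1], whose index 2 selects handler_two. The counter drops from 3 to 2 before the handler runs.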

Underlying Structures

_arg_1

The structure for the first function parameter has members at 60h and 23h:

  • The member at 23h is 1 byte in size and is decremented by one in line 4 of the C-code above.
  • At 60h we find something more complicated. First, the value is decremented by 36. Then the pointer from argument 2 is stored at offset 14h relative to the new value. So maybe it is a pointer into an array of structs that are 36 bytes in size.

struct _arg_1 {
    /* (23h bytes of other members) */
    char off_23h;
    /* (3Ch bytes of other members) */
    struct _s1 *off_60h; /* points into an array of struct _s1 */
};

_s1

As noted above, arg_1->off_60h might point into an array of structs _s1 (the type called _v1 in the walk-through), which are 36 bytes in size. Of these structs we know two things:

  • at offset 14h we store the structure pointer from parameter 2, arg_2 (see line 7 of C code)
  • at offset 0 there is an index value (see line 8 of C code)

/* sizeof(struct _s1) = 36 */
struct _s1 {
    unsigned char off_0h;
    /* (13h bytes of other members) */
    struct _arg_2 *off_14h;
    /* (0Ch bytes of other members) */
};

_arg_2

Line 3 does the only member access of struct _arg_2. The member is a pointer to another struct, which is accessed in line 9 of the C code. I call this struct _s2 (it is the type called _v2 in the walk-through):

struct _arg_2 {
    /* (8 bytes of other members) */
    struct _s2 *off_8h;
};

_s2

The structure that arg_2->off_8h points to is read at line 9 of the C code. The struct has a member at offset 38h, which is an array of 4-byte function pointers:

struct _s2 {
    /* (38h bytes of other members) */
    func_ptr off_38h[...];
};

where the elements of the array are of type:

typedef unknown_type (__stdcall *func_ptr)(struct _arg_2 *, struct _arg_1 *);
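The padding sizes in the structs above are easy to get wrong, so it helps to let the compiler check them. The sketch below assumes a C11 compiler and models every 32-bit pointer as a uint32_t so that the offsets also hold on a 64-bit host. The *_layout names and pad_* members are invented for this check, and the array at 38h is given one element because its real length is unknown.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ptr32; /* a pointer on the x86 target is 4 bytes */

struct s1_layout {
    uint8_t off_0h;
    uint8_t pad_1h[0x13];
    ptr32   off_14h;
    uint8_t pad_18h[0x0C];
};

struct arg1_layout {
    uint8_t pad_0h[0x23];
    uint8_t off_23h;
    uint8_t pad_24h[0x3C];
    ptr32   off_60h;
};

struct arg2_layout {
    uint8_t pad_0h[8];
    ptr32   off_8h;
};

struct s2_layout {
    uint8_t pad_0h[0x38];
    ptr32   off_38h[1]; /* real length unknown */
};

/* Compile-time checks against the offsets seen in the disassembly. */
_Static_assert(sizeof(struct s1_layout) == 0x24, "s1 must be 36 bytes");
_Static_assert(offsetof(struct s1_layout, off_14h) == 0x14, "s1 offset 14h");
_Static_assert(offsetof(struct arg1_layout, off_23h) == 0x23, "arg_1 offset 23h");
_Static_assert(offsetof(struct arg1_layout, off_60h) == 0x60, "arg_1 offset 60h");
_Static_assert(offsetof(struct arg2_layout, off_8h) == 0x08, "arg_2 offset 8h");
_Static_assert(offsetof(struct s2_layout, off_38h) == 0x38, "s2 offset 38h");

/* Run-time variant of the same checks; returns 1 if everything matches. */
static int layout_matches(void)
{
    return sizeof(struct s1_layout) == 0x24
        && offsetof(struct s1_layout, off_14h) == 0x14
        && offsetof(struct arg1_layout, off_23h) == 0x23
        && offsetof(struct arg1_layout, off_60h) == 0x60
        && offsetof(struct arg2_layout, off_8h) == 0x08
        && offsetof(struct s2_layout, off_38h) == 0x38;
}
```

If one of the padding sizes is off, for example 36h instead of 3Ch before the member at 60h, the corresponding _Static_assert stops the build.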

Archived Comments

Note: I removed the Disqus integration in an effort to cut down on bloat. The following comments were retrieved with the export functionality of Disqus. If you have comments, please reach out to me by Twitter or email.

roger samuel Jul 14, 2015 17:17:08 UTC

Hey where did you got the sample? Can you suggest the repositories
malware.lu doesnt have this sample anymore I guess.

Johannes Bader Jul 14, 2015 17:30:05 UTC

You can download all samples of the book here:

https://grsecurity.net/malw...

This particular exercise is about sample_H

roger samuel Jul 15, 2015 17:02:45 UTC

Hey thanks a lot

But the zip file doesn't have Sample H

sameerpatil07 Jul 25, 2016 14:22:45 UTC

Good analysis. Very helpful. Can you plz share sample H(not availble anywhere else)- cb3b2403e1d777c250210d4ed4567cb527cab0f4

Johannes Bader Jul 25, 2016 14:25:36 UTC