Tuts 4 You

Assembly and C++


Aldhard Oswine


I'm trying to learn reverse engineering. At the moment I compile C++ code without any optimization and look at the corresponding assembly in IDA's disassembler, but there are some parts of the code where I can't work out what is happening.
I think many of you can help me answer these questions :)

C++ code:
 

#include <stdio.h>
#include <string>

void fillarray(int is[11], int size)
{
	for (int i = 0; i < size; ++i)
	{
		is[i] = i;
	}
}

int main()
{
	const char* chr = "First";
	std::string m = "Secdond";

	static int arr[11];

	std::string m1 = "Third";

	auto size = sizeof(arr) / sizeof(arr[0]);

	fillarray(arr, size);

	int num = 11;

	printf_s("%d", num);


	getchar();

	return 0;
}


The main function in IDA:
 

.text:00401E20
.text:00401E20
.text:00401E20 ; Attributes: bp-based frame
.text:00401E20
.text:00401E20 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:00401E20 main proc near
.text:00401E20
.text:00401E20 First_via_char= dword ptr -50h
.text:00401E20 var_4C= dword ptr -4Ch
.text:00401E20 var_eleven= dword ptr -48h
.text:00401E20 size= dword ptr -44h
.text:00401E20 Second_via_str= byte ptr -40h
.text:00401E20 Third_via_str= byte ptr -28h
.text:00401E20 CANARY= dword ptr -10h
.text:00401E20 var_C= dword ptr -0Ch
.text:00401E20 var_4= dword ptr -4
.text:00401E20 argc= dword ptr  8
.text:00401E20 argv= dword ptr  0Ch
.text:00401E20 envp= dword ptr  10h
.text:00401E20
.text:00401E20 push    ebp
.text:00401E21 mov     ebp, esp
.text:00401E23 push    0FFFFFFFFh
.text:00401E25 push    offset sub_402D20
.text:00401E2A mov     eax, large fs:0         // What is this? I know that fs:0 is used to access the PEB, but why does this code use it?
.text:00401E30 push    eax
.text:00401E31 sub     esp, 44h
.text:00401E34 mov     eax, ___security_cookie
.text:00401E39 xor     eax, ebp
.text:00401E3B mov     [ebp+CANARY], eax       // I think this is the canary
.text:00401E3E push    eax
.text:00401E3F lea     eax, [ebp+var_C]       // what is var_C for?
.text:00401E42 mov     large fs:0, eax        // At this moment, var_C is uninitialized. How and why does the program need this?
.text:00401E48 mov     [ebp+First_via_char], offset aFirst ; "First"
.text:00401E4F push    offset aSecdond ; "Secdond"
.text:00401E54 lea     ecx, [ebp+Second_via_str]
.text:00401E57 call    basic_string
.text:00401E5C mov     [ebp+var_4], 0
.text:00401E63 push    offset aThird   ; "Third"
.text:00401E68 lea     ecx, [ebp+Third_via_str]
.text:00401E6B call    basic_string
.text:00401E70 mov     byte ptr [ebp+var_4], 1      // what is var_4?
.text:00401E74 mov     [ebp+size], 0Bh
.text:00401E7B mov     eax, [ebp+size]
.text:00401E7E push    eax
.text:00401E7F push    offset unk_404098 ; address of array
.text:00401E84 call    fill_array_function
.text:00401E89 add     esp, 8
.text:00401E8C mov     [ebp+var_eleven], 0Bh
.text:00401E93 mov     ecx, [ebp+var_eleven]
.text:00401E96 push    ecx
.text:00401E97 push    offset aD       ; "%d"
.text:00401E9C call    printf
.text:00401EA1 add     esp, 8
.text:00401EA4 call    ds:getchar
.text:00401EAA mov     [ebp+var_4C], 0
.text:00401EB1 mov     byte ptr [ebp+var_4], 0
.text:00401EB5 lea     ecx, [ebp+Third_via_str]
.text:00401EB8 call    sub_401270                    // What is this? I think it's destructor, but I'm not sure.
.text:00401EBD mov     [ebp+var_4], 0FFFFFFFFh
.text:00401EC4 lea     ecx, [ebp+Second_via_str]
.text:00401EC7 call    sub_401270                    // Destructor?
.text:00401ECC mov     eax, [ebp+var_4C]             // what is var_4C?
.text:00401ECF mov     ecx, [ebp+var_C]              // var_C?
.text:00401ED2 mov     large fs:0, ecx
.text:00401ED9 pop     ecx
.text:00401EDA mov     ecx, [ebp+CANARY]
.text:00401EDD xor     ecx, ebp
.text:00401EDF call    sub_401F68
.text:00401EE4 mov     esp, ebp
.text:00401EE6 pop     ebp
.text:00401EE7 retn
.text:00401EE7 main endp
.text:004

Stack:
 

-00000054                 db ? ; undefined
-00000053                 db ? ; undefined
-00000052                 db ? ; undefined
-00000051                 db ? ; undefined
-00000050 First_via_char  dd ?
-0000004C var_4C          dd ?
-00000048 var_eleven      dd ?
-00000044 size            dd ?
-00000040 Second_via_str  db 24 dup(?)
-00000028 Third_via_str   db 24 dup(?)
-00000010 CANARY          dd ?
-0000000C var_C           dd ?
-00000008                 db ? ; undefined
-00000007                 db ? ; undefined
-00000006                 db ? ; undefined
-00000005                 db ? ; undefined
-00000004 var_4           dd ?
+00000000  s              db 4 dup(?)
+00000004  r              db 4 dup(?)
+00000008 argc            dd ?
+0000000C argv            dd ?                    ; offset
+00000010 envp            dd ?                    ; offset
+00000014
+00000014 ; end of stack variables

 

Edited by Aldhard Oswine

Most of the stuff you didn't recognize is related to C++ exception handling. Search for more details; there are plenty of articles on how it's done.


What kao said. Perhaps work with some smaller examples and break your end goals down, e.g. understand how calls and returns work, how arithmetic works, etc. Maybe take a look at the books we recommended in your previous post and try those examples; they explain the fundamentals in some detail.

Edited by eXit
Agreeing with kao

Compiling the code with MSVC with debug information and opening it in x64dbg should give you source line numbers, which helps you follow what's going on (WinDbg also works, but I wouldn't recommend it).


If you are using Visual Studio, the best way to get compiled output that maps almost 1:1 to your source is the following:

  • Set the build configuration to Release.
  • Open the project properties and apply the following settings:
  • C/C++
    • General
      • Set Debug Information Format to None.
    • Optimization
      • Set Optimization to Disabled.
      • Set Inline Function Expansion to Disabled.
      • Set Enable Intrinsic Functions to No.
      • Set Whole Program Optimization to No.
    • Code Generation
      • Set Enable C++ Exceptions to No.
      • Set Security Check to Disable Security Check.
      • Set Enable Function-Level Linking to No.

When you build the app, the generated code should be fairly close to what you wrote (apart from the CRT startup code, which is still linked in).

You can avoid the CRT startup code by setting a custom entry point, but this can cause problems if your code relies on anything the CRT normally sets up.

 

