fabiothebest Posted September 8, 2016

I'm studying x86 architecture and assembly to build a basis for studying reversing and exploit development. I'm following a course on opensecuritytraining.info. I see a Hello World example:

    push ebp
    mov ebp, esp
    push offset aHelloWorld ; "Hello world\n"
    call ds:__imp__printf
    add esp, 4
    mov eax, 1234h
    pop ebp
    retn

This code was generated by Visual C++ 2005 on Windows with buffer overflow protection turned off and disassembled with IDA Pro 4.9 Free Version. I'm trying to understand what each line does.

The first line is push ebp. I know ebp stands for base pointer. What is its function? I see that in the second line the value in esp is moved into ebp, and searching online I see that these first 2 instructions are very common at the beginning of an assembly program. Are ebp and esp empty at the beginning, though? I'm new to assembly. Is ebp used for stack frames, so only when we have a function in our code, and is it optional for a simple program?

Then push offset aHelloWorld ; "Hello world\n". The part after ; is a comment, so it doesn't get executed, right? The first part pushes the address of the string Hello World onto the stack, right? But where is the string declared? I'm not sure I understand.

Then call ds:__imp__printf. It seems to be a call to a function; printf is a builtin function, right? And does ds stand for data segment register? Is it used because we are trying to access a memory operand that isn't on the stack?

Then add esp, 4. Do we add 4 bytes to esp? Why?

Then mov eax, 1234h. What is 1234h here?

Then pop ebp. It was pushed at the beginning. Is it necessary to pop it at the end?

Then retn (I knew about ret for returning after calling a function). I read that the n in retn refers to the number of arguments pushed by the caller. It isn't very clear to me. Can you help me understand?
kao Posted September 8, 2016

In general, if you're having problems with a specific course, it's better to ask the course authors. I haven't seen the course and can't comment on its quality. Judging by your questions, you should rather find some ebook about assembly language basics. Any of them should cover those concepts.

Quick answers, as I'm on mobile...

#1 - push ebp, mov ebp, esp - google "function prologue"
#2 - if you click on aHelloWorld, IDA will show where it's declared. It will be in the data segment.
#3 - call. There are no builtin functions in assembler. Clicking in IDA will show you where the call goes.
#4 - add esp, 4 - google "Windows calling conventions"
#5 - mov eax, 1234h. eax is used to return a value from a function. In C it would look like "return 0x1234;"
#6 - pop ebp - google "function epilogue"
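To tie answers #2 to #5 together, here is a rough sketch of source code that could plausibly compile to the listing in the first post. The original source is not shown in the thread, so this is an assumption, and the function name hello_main is made up for illustration (in the course it is presumably main).

```cpp
#include <cstdio>

// Hypothetical stand-in for the function in the listing; the real name and
// source are not given in the thread.
int hello_main() {
    // push offset aHelloWorld / call ds:__imp__printf / add esp, 4
    std::printf("Hello world\n");
    // mov eax, 1234h: the return value travels back to the caller in eax
    return 0x1234;
}
```

The prologue (push ebp / mov ebp, esp) and epilogue (pop ebp / retn) have no counterpart in the source; the compiler emits them around the function body.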
fabiothebest Posted September 8, 2016 Author

Thank you very much, kao. You confirmed some of my thoughts and pointed me in the right direction for some other things. I need to learn more about computer architecture and assembly.
evlncrn8 Posted September 8, 2016

__imp is also a MASM thing for imports, so it's call [printf], which will be in the executable's import table, populated when the file is loaded into memory.

add esp, 4 is there because printf is a C function, i.e. not stdcall (google "function calling conventions"). For C (cdecl) functions it's up to the caller to balance the stack. You passed 1 param to the call, the pointer to the hello world text, and 1 param in 32 bit = 4 bytes; that's where the 4 comes from. If the function were __stdcall, the add esp, 4 would not be present.
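A loose C++ analogy for the import-table call described above, offered as a sketch rather than a description of what the Windows loader literally does: the import slot behaves like a function pointer variable that something else fills in before the program calls through it. The names imp_printf_slot, fake_loader and call_via_slot are made up for illustration.

```cpp
#include <cstdio>

// Hypothetical stand-in for one import-table slot: a pointer-sized variable
// that holds the address of printf once it has been filled in.
using PrintfFn = int (*)(const char*, ...);
PrintfFn imp_printf_slot = nullptr;

// Plays the role of the loader: stores the real address in the slot.
void fake_loader() {
    imp_printf_slot = &std::printf;
}

// Indirect call through the slot, similar in spirit to call ds:__imp__printf.
// Returns printf's own return value (the number of characters written).
int call_via_slot() {
    return imp_printf_slot("Hello world\n");
}
```

Calling fake_loader() and then call_via_slot() prints the string; the call site never names printf directly, only the slot that holds its address.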
0xNOP Posted September 8, 2016 (edited)

BLOCK A - ###FUNCTION PROLOGUE### (Most functions begin like this. You don't really need to be too technical about it, but if you insist, just look for "ASM Function Prologue".)

    push ebp
    mov ebp, esp

BLOCK B - ###PUSH OFFSET FOR THE STRING REFERENCE "Hello World\n" ON STACK###

    push offset aHelloWorld ; "Hello world\n"

BLOCK C - ###CALL PRINTF FROM THE DATA SEGMENT###

    call ds:__imp__printf

BLOCK D - ###ALLOCATE 4 BYTES FOR LOCAL VARIABLE###

    add esp, 4

BLOCK E - ###MOVES 1234 TO EAX TO BE USED FOR A RETURN###

    mov eax, 1234h

BLOCK F - ###FUNCTION EPILOGUE###

    pop ebp

BLOCK G - ###RETURN TO THE CALLER###

    retn

BLOCK A: https://en.wikipedia.org/wiki/Function_prologue#Prologue
BLOCK B: http://stackoverflow.com/a/17634965/3874785
BLOCK C: ?
BLOCK D: https://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames#Standard_Entry_Sequence
BLOCK E: http://stackoverflow.com/questions/6171172/return-value-of-a-c-function-to-asm#6171209
BLOCK F: https://en.wikipedia.org/wiki/Function_prologue#Epilogue

That's what I understand so far. I hope it helps with something; kao explained it very well too.

Edited September 8, 2016 by 0xNOP
evlncrn8 Posted September 8, 2016

    BLOCK D - ###ALLOCATE 4 BYTES FOR LOCAL VARIABLE###
    add esp, 4

is wrong... 4 bytes were used for the push before the call to printf. printf is a C function (not stdcall), so the add esp, 4 is there to rebalance the stack.
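The stack bookkeeping behind that correction can be modelled with a few lines of arithmetic. This is only a toy model of ESP, not real machine state: the starting value kStartEsp is arbitrary, and the names Convention and esp_after_call are invented for the sketch.

```cpp
#include <cstdint>

enum class Convention { Cdecl, Stdcall };

// Arbitrary illustrative starting value for ESP (an assumption, not a real address).
constexpr std::uint32_t kStartEsp = 0x0012FF80;

// Models ESP across "push arg / call f / (optional) add esp, 4"
// for a function that takes one 4-byte argument.
std::uint32_t esp_after_call(Convention conv, bool caller_adds_4) {
    std::uint32_t esp = kStartEsp;
    esp -= 4;                          // push offset aHelloWorld (the one argument)
    esp -= 4;                          // call: pushes the return address
    esp += 4;                          // ret: pops the return address
    if (conv == Convention::Stdcall) {
        esp += 4;                      // "retn 4": the callee also removes its argument
    }
    if (caller_adds_4) {
        esp += 4;                      // "add esp, 4": the caller removes the argument
    }
    return esp;
}
```

In this model, esp_after_call(Convention::Cdecl, true) and esp_after_call(Convention::Stdcall, false) both end at kStartEsp, while a cdecl call without the add esp, 4 ends 4 bytes too low. It also speaks to the retn question in the first post: IDA's retn is a near return, and only a form with an operand such as retn 4 removes argument bytes.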
fabiothebest Posted September 8, 2016 Author Posted September 8, 2016 Thank you very much everyone. Your comments are very useful. I hope I can learn much and contribute as well.
0xNOP Posted September 8, 2016 (edited)

4 hours ago, evlncrn8 said: "... printf is a c function (not stdcall) so the add esp, 4 is to rebalance the stack"

Ahh! Gotcha. Now this got me thinking... if you don't specify (or force) the __cdecl or __stdcall calling convention on a function, which one is the default? Is it C automatically? See example...

    #include <iostream>

    void __cdecl HelloWorldCDECL() // force C calling convention
    {
        std::cout << "Hello World" << std::endl;
    }

    void __stdcall HelloWorldSTDCALL() // force STDCALL calling convention
    {
        std::cout << "Hello World" << std::endl;
    }

Edited September 9, 2016 by 0xNOP
evlncrn8 Posted September 9, 2016

By default, at least in MSVC, it's cdecl, unless you explicitly state __stdcall or WINAPI.