Tuts 4 You

[TitanEngine] Help finding user-defined functions/procedures


Stasis


Hi everyone,

Instead of writing IDC scripts with IDA Pro, I would like to integrate what IDA Pro can do into a simple C++ interface using TitanEngine.

TitanEngine is a good tool with a detailed SDK that can perform disassembly, PE modification, hooking, etc.

I would like to know how I can retrieve the addresses of all user-defined functions in a C++ .exe program using TitanEngine.

If I create a dummy C++ file with 3 functions, void func1(), void func2() and void func3(), which API can I use to retrieve those 3 functions after disassembling with TitanEngine?

IDA Pro has FLIRT signatures and the Names view, which automate the analysis of all user-defined functions...

Is there a byte pattern to retrieve all user-defined functions with TitanEngine, using purely static disassembly?

I am not talking about DLLs or external module linking, just native user-defined functions within a C++ executable.

I went through the TitanEngine SDK but am not familiar with the API.

Thanks all in advance.

Cheers.

Link to comment

You could search for the CC CC CC blocks that separate functions. I don't think there is such a function in TitanEngine; better try TitaniumCore.

Mr. eXoDia

Link to comment

CC blocks are used by recent MSVC++ compilers. There are a lot of other common options for filling up space between functions (e.g. mov eax,eax), not to mention inlined data (as emitted by Borland compilers). And finally, there might not be a filler block at all.
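To make the padding idea concrete, here is a minimal Python sketch that works on a raw dump of the code section. It is not a TitanEngine API; the function names, the `base` parameter and the minimum run length of 2 are assumptions chosen for illustration. It only handles single-byte fillers such as CC or 90, so multi-byte fillers like mov eax,eax and inlined data would need separate handling.

```python
def find_padding_runs(code: bytes, filler: int = 0xCC, min_len: int = 2):
    """Return (offset, length) for every run of `filler` bytes of at least `min_len`."""
    runs = []
    i, n = 0, len(code)
    while i < n:
        if code[i] == filler:
            j = i
            while j < n and code[j] == filler:
                j += 1
            if j - i >= min_len:
                runs.append((i, j - i))
            i = j
        else:
            i += 1
    return runs


def candidate_starts_after_padding(code: bytes, base: int = 0,
                                   filler: int = 0xCC, min_len: int = 2):
    """Addresses of the first non-filler byte after each padding run.

    These are only candidates: a run of CC bytes can also occur inside data.
    """
    return [base + start + length
            for start, length in find_padding_runs(code, filler, min_len)
            if start + length < len(code)]
```

Every address this returns still has to be checked by disassembling from it, for exactly the reasons given above.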

There is no good way to do this. You could walk the code flow starting at the EP using a disasm engine and trace into calls, then try to identify "custom" functions.

If you have access to the source, you could store pointers to the different functions. Even debugging info (.dbg, .pdb) would be enough.

What's the difference between TitanEngine and TitaniumCore?

Link to comment

TitanEngine does not take care of binary analysis for you. What it does is help perform common tasks during debugging, unpacking and PE modification.

Using IDC or IDAPython is your best bet because you can use its analysis. Trying to reinvent the wheel will take hours and leaves you with something far less usable than what IDA provides.

If you still want to go ahead, you can try scanning for compiler-specific fillers between functions (as Mr. eXoDia suggested), but you're going to have a tough time finding all relevant variations (release vs debug build, different versions of the compiler, etc.).

And that still leaves you with the problem of having to separate code from data (e.g. metadata in the Delphi code section).

There is academic research, but implementing that is likely more effort than you planned to put into it.

TitaniumCore is a whole different beast. It's meant to perform static format identification, reconstruction and unpacking, which includes anything from PE packers to stealth image files to malformed zip archives. Besides, unlike TitanEngine, it's apparently not even available to the public.

Link to comment

I wouldn't recommend using TitanEngine 2.0 for procedural analysis. The next version, 3.0, which is due in Q2 this year, will come with function analysis integrated and is better suited for the task. However, you might not even need that, since it's your own program you want to know the function addresses of. That said, you have a couple of simple options:

1) Export the functions you need to know the address of and parse the export table or use GetProcAddress

2) Create a map file while building the application, it has the addresses of all procedures

3) Create a pdb file which has every line of the code associated with the address in the file and use dbghelp to parse it
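As a rough illustration of option 2, the Python sketch below pulls function addresses out of the text of a linker map file. The row layout in the comment is my assumption about what the "Publics by Value" table of an MSVC map looks like (section:offset, symbol, Rva+Base, optional flags where `f` marks a function, then Lib:Object). `MAP_ROW` and `parse_map_functions` are made-up names, so check the pattern against a real map file from your own build before relying on it.

```python
import re

# Assumed row layout (verify against your own .map file):
#   0001:00000000   ?func1@@YAXXZ   00401000 f   main.obj
MAP_ROW = re.compile(
    r"^\s*(?P<sec>[0-9A-Fa-f]{4}):(?P<off>[0-9A-Fa-f]{8})\s+"
    r"(?P<name>\S+)\s+(?P<addr>[0-9A-Fa-f]{8,16})\s+(?P<rest>.*)$"
)


def parse_map_functions(text: str) -> dict:
    """Map symbol name -> address for rows that carry the 'f' (function) flag."""
    funcs = {}
    for line in text.splitlines():
        m = MAP_ROW.match(line)
        if not m:
            continue  # headers, blank lines, section tables, etc.
        tokens = m.group("rest").split()
        # The last token is the Lib:Object column; anything before it is a flag.
        if "f" in tokens[:-1]:
            funcs[m.group("name")] = int(m.group("addr"), 16)
    return funcs
```

Symbol names come out in decorated form for C++ code, so they would still need to be undecorated to match the names in the source.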

To answer the other question, TitanEngine is the framework to build PE tools and TitaniumCore is a powerful static analysis platform that identifies, validates and decomposes file content while producing information about the input package. It is our commercial product which, like TitanEngine, comes with its SDK.

Best regards

Link to comment

thanks for the clarification, looking forward to v3.

anyways, how would you go about this right now?

scanning for function prologues (push ebp+..., ENTER,..) or epilogues (LEAVE, pop r32+retn, plain retn)?
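For what it's worth, a naive prologue scan is only a few lines. The Python sketch below looks for the two common encodings of push ebp / mov ebp, esp in a raw dump of the code section. The pattern list and the `find_prologues` name are illustrative assumptions, it is far from complete (no ENTER, no frameless functions), and the same bytes can show up inside data or in the middle of another instruction.

```python
# Two encodings of "push ebp; mov ebp, esp" (32-bit x86).
PROLOGUES = (
    bytes.fromhex("558bec"),  # 55, then 8B EC (mov r32, r/m32 form)
    bytes.fromhex("5589e5"),  # 55, then 89 E5 (mov r/m32, r32 form)
)


def find_prologues(code: bytes, base: int = 0):
    """Return sorted addresses where one of the known prologue patterns occurs."""
    hits = set()
    for pattern in PROLOGUES:
        pos = code.find(pattern)
        while pos != -1:
            hits.add(base + pos)
            pos = code.find(pattern, pos + 1)
    return sorted(hits)
```

The hits are best treated as hints to confirm by other means, not as a list of functions.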

From experience I can say that the OllyDbg analysis feature depends on RETN instructions.

I'm not too enthusiastic about looking for filler space...

Link to comment

Olly does a really clever thing. It does a bruteforce disassembly of the code section looking for all possible calls inside that section. Once they are identified you have your function starting points, from where you try to disassemble and match each starting point with a return instruction. Additionally you can count the references to each function start to reduce the number of false positives. Pattern matching on the function prologue can be done for those addresses that have only one reference. If available, data such as imports, exports and relocations can be taken into account while trying to differentiate code from data.
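The call-scanning and reference-counting steps can be sketched in Python as below. This is a simplified illustration and not OllyDbg's actual implementation: it only treats every E8 byte as a possible call rel32, keeps targets that land inside the section, and counts them. The name `scan_call_targets` and the `base` parameter are assumptions for the example.

```python
import struct
from collections import Counter


def scan_call_targets(code: bytes, base: int = 0) -> Counter:
    """Count in-section targets of every possible `call rel32` (opcode E8).

    Every E8 byte is tried, including ones that are really operands or data,
    so low reference counts should be treated with suspicion.
    """
    refs = Counter()
    end = base + len(code)
    for off in range(len(code) - 4):
        if code[off] != 0xE8:
            continue
        (rel,) = struct.unpack_from("<i", code, off + 1)  # signed 32-bit displacement
        target = base + off + 5 + rel  # relative to the end of the 5-byte instruction
        if base <= target < end:
            refs[target] += 1
    return refs
```

Targets referenced several times are the stronger candidates. The single-reference ones are where an extra prologue check or a disassembly up to a return instruction would come in.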

Best regards

Edited by ap0x
Link to comment

I am interested in this, because I tried to write a "function-finder" a while ago myself (but then figured it's easier to grab the data from an IDA-generated .map file).

After processing symbols and walking "safe" functions (EP, exports, debug info) as suggested in my first post, I would also go for the bruteforce approach of checking all 0xE8 occurrences, but I found it to be inefficient (although I have to admit the samples I tested it on were... limited).

The main problem I had (and this is where I opted for the .map file) was identifying functions that are not directly referenced. This is especially the case with Borland PE files, and I was mainly working with MSVC and Delphi files.

Additionally, the Delphi compilers in particular like to emit crap like two-line functions without a prologue or epilogue, while at the same time they emit random dummy bytes right in front of them.

edit:

btw, the homepage in your sig is down, did you move?

edit2:

the more I think about this, the more I feel like digging up that function-finder project of mine...

Edited by deepzero
Link to comment

The problem is that there are many ways of compiling a single function. I remember when I was working with MSVC6: with different parameters everything seemed to change, like prologues, epilogues, and the bytes between functions (NOP, mov eax, eax, lea eax, [eax]); sometimes there wasn't any ret but a jump to another function. And that was all with MSVC. Then I tested with Delphi 7, and my project was basically hardcoded for MSVC, so all my effort was in vain.

The only idea that could fulfill the purpose of finding all functions was a step-by-step analyser, like I have only one function, the OEP, so when OEP function do a call, then I should add that call to the registry and so on, then a bruteforce analysis should complete the rest. It worked mainly on debug apps, and on most release builds, but there were always functions that escaped the analyser, like virtual calls (CALL EAX was the opcode on MSVC) and function pointers, and the most annoying thing was that sometimes with the bruteforce analysis, the switch table emulates some prologue functions making a false positive.

So the answer, like with all things, is "nothing is perfect". I would recommend using software that has these things already coded, and rechecking the results it gives. I think that software that searches for functions by "walking opcode by opcode" would give you highly accurate results with most compilers, but it is something that takes time to program and will take time to execute.

Edited by Deathway
Link to comment

like I have only one function, the OEP, so when OEP function do a call, then I should add that call to the registry and so on,

yes, that's what I mean by "walking trusted functions" :)

But especially in delphi, the results are very limited.

the switch table emulates some prologue functions making a false positive.

i dont understand you here... ;)

But switch detection is another topic, and something Oleh apparently spent a lot of time on when writing OllyDbg, as Olly sees switches even where I do not.

it's fairly obvious when you have a jmp dword [xxx*4+y], but what if you have something like

cmp eax,234234
jcc x
sub eax,4545
jcc x
sub eax,3434
jcc x
etc.

Link to comment

Hi guys, thanks all for your valued input!

As for the mentioned pdb file: I am trying to code an automated analyzer in C++ which analyzes any Windows PE executable (not just those I coded) and identifies function names, offsets, etc. automatically, like IDA Pro.

I did my scripting in IDC, initially calling IDA Pro from the command line. It seems that I have too many separate modules for a single project, and thus I am wondering if I can integrate TitanEngine to do everything conveniently.

I can 'try' to code the function analyzer manually, but that would really be reinventing the wheel when there might be tools or SDKs out there.

@killboy and Mr.eXoDia:

Yup, I read up on TitaniumCore. Useful tool, but I wouldn't want to dive into that for my simple educational project.

I am still calling IDC with IDA Pro through a command line integrated in my program. Hope to switch over to TitanEngine totally.

@deepzero:

So far I am using some byte patterns for custom named functions, but they aren't too accurate against a large variety of test samples.

Would really appreciate it if you found your function-finder code somewhere. IDA Pro/OllyDbg is pretty much accurate unless the user-defined functions themselves are obfuscated.

@ap0x:

Really looking forward to TitanEngine v3. The tool will really save a lot of development time.

You mentioned brute forcing and studying the disassembly. I tried that, but it was all too much effort to spare.

I started from the OEP, tracing and diving into function jmps/calls while making sure to avoid system calls, then noting the operation patterns (register stores, stack backup). Got lost in the trace.

Edited by Stasis
Link to comment
