Tuts 4 You

[DevirtualizeMe] VMProtect 2.13.5


HellSpider

That's a hell of a list. Thanks.

 

Today I spent a lot of time on it, and I must say I picked up a lot about this VM software. I devirtualized and deobfuscated the handler responsible for the call instruction, overwrote the handler addresses, and indeed it worked :) I can say that I see general patterns here. I wrote the sniffer as a DLL (an inline hook in the main VM handler).

 

I want to move forward, so here is a little question for all the great people here :)

 

Now I must somehow understand the PCode pointed to by ESI.

There are 0x100 entries in the handler table. As far as I understand, each one is responsible for handling a particular instruction. In my case handlers[0x1D] was responsible for CALL sprintf, CALL MessageBoxA and "VM EXIT" (maybe it's just RET and not CALL, I guess?). At the entry point of the 0x1D handler the call address is already on the stack, so most probably the preceding handler prepared it. There are a lot of 0x20 handlers. I'm adding a small log of the handler IDs and pcode ("licznik" means "counter" in my language). Thanks again, everyone, for such quick answers! A day ago I was terrified when I saw a VM :D Now I'm... a boss :D

So now I'm going back to work :)

I think I will check the handlers that run before the msgbox is invoked.

I also wonder where it stores the context (registers, flags); that will be my second task.

 

Licznik 1020 Handler address 405E50 Handler ID 15 PCode 47

Licznik 1021 Handler address 404A4D Handler ID 6E PCode F4

Licznik 1022 Handler address 4061C2 Handler ID BF PCode 97

Licznik 1023 Handler address 405E50 Handler ID 15 PCode 19

Licznik 1024 Handler address 4061C2 Handler ID BF PCode 67

Licznik 1025 Handler address 404EB5 Handler ID 17 PCode 23

Licznik 1026 Handler address 404430 Handler ID 47 PCode CD

Licznik 1027 Handler address 40134E Handler ID 61 PCode 9C

Licznik 1028 Handler address 404A4D Handler ID 5E PCode 1C

Licznik 1029 Handler address 404A4D Handler ID 5E PCode D5

Licznik 1030 Handler address 4061C2 Handler ID BF PCode A6

Licznik 1031 Handler address 4061C2 Handler ID EA PCode 34

Licznik 1032 Handler address 4061C2 Handler ID 20 PCode 60

Licznik 1033 Handler address 4061C2 Handler ID 20 PCode 70

Licznik 1034 Handler address 4061C2 Handler ID 20 PCode A4

Licznik 1035 Handler address 4061C2 Handler ID 20 PCode 7C

Licznik 1036 Handler address 4061C2 Handler ID 20 PCode A0

Licznik 1037 Handler address 4061C2 Handler ID 20 PCode 5C

Licznik 1038 Handler address 4061C2 Handler ID BF PCode 0F

Licznik 1039 Handler address 4061C2 Handler ID 20 PCode 07

Licznik 1040 Handler address 4061C2 Handler ID 20 PCode 43

Licznik 1041 Handler address 4052AB Handler ID 1D PCode C9 <-- calls MessageBoxA
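
A log in this format is easy to tally by machine. The following is a minimal Python sketch, not part of the original sniffer; the function names are made up for illustration, and it only assumes the line layout shown in the log above. Grouping by address as well as by ID can be useful, since in this log several IDs (BF, EA, 20) resolve to the same address 4061C2.

```python
import re
from collections import Counter

# Matches one sniffer line, e.g.
# "Licznik 1041 Handler address 4052AB Handler ID 1D PCode C9"
LINE_RE = re.compile(
    r"Licznik\s+(\d+)\s+Handler address\s+([0-9A-Fa-f]+)\s+"
    r"Handler ID\s+([0-9A-Fa-f]+)\s+PCode\s+([0-9A-Fa-f]+)"
)

def parse_log(text):
    """Return a list of (counter, handler_address, handler_id, pcode) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            entries.append((int(m.group(1)), int(m.group(2), 16),
                            int(m.group(3), 16), int(m.group(4), 16)))
    return entries

def histogram_by_id(entries):
    """Count how often each handler ID appears in the trace."""
    return Counter(handler_id for _, _, handler_id, _ in entries)

def ids_by_address(entries):
    """Map each handler address to the set of IDs that resolved to it."""
    grouped = {}
    for _, address, handler_id, _ in entries:
        grouped.setdefault(address, set()).add(handler_id)
    return grouped

SAMPLE = """\
Licznik 1038 Handler address 4061C2 Handler ID BF PCode 0F
Licznik 1039 Handler address 4061C2 Handler ID 20 PCode 07
Licznik 1040 Handler address 4061C2 Handler ID 20 PCode 43
Licznik 1041 Handler address 4052AB Handler ID 1D PCode C9 <-- calls MessageBoxA
"""

entries = parse_log(SAMPLE)
by_id = histogram_by_id(entries)
by_address = ids_by_address(entries)
```

Sorting `by_id.most_common()` shows quickly which handlers dominate the trace, which is a reasonable place to start deobfuscating.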

The C9 pcode is the return handler. VMP does jumps the push/ret way; for calls it's push, push, ret, etc. One thing I'd like to mention in this regard: the truth about "original code gets lost" lies in the fact that CISC code is translated into small RISC-like handler code, as you can see from the call example. Check the VMP reversing documents and you will understand what I mean. For complete restoration, pattern-based search and replacement is the best way to go (check deathway's plugin, a great example).


  • 3 weeks later...

I spent some time on VMProtect (I'm still a newbie with VMs, though). I found out where it stores the context (registers, flags). I also see the flow of the main VM handler, which gets the encrypted VM handler addresses from the table, with ESI walking the pcode backwards. Where I'm getting stuck is: how can I find out what each handler does? I have a log of handler execution from beginning to end, but how can I tell which handler is doing what? I actually manually copied and deobfuscated the return handler, where I clearly see a pop of every register plus ret, and the push handler, which emulates a push in the preserved context.
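
To make the flow described above concrete, here is a rough Python sketch of such a dispatch loop. It is only a model under stated assumptions: `decode_id` and `decrypt_address` are hypothetical placeholders (identity functions by default) for whatever transformations the real VM applies, and the sketch ignores the operand bytes that real handlers consume from the pcode.

```python
def trace_dispatch(pcode, start, handler_table,
                   decode_id=lambda byte: byte,
                   decrypt_address=lambda entry: entry,
                   steps=16):
    """Walk the pcode backwards, yielding (offset, handler_id, handler_address).

    pcode           -- bytes of the virtual program
    start           -- offset of the first byte to execute (plays the role of ESI)
    handler_table   -- 0x100 (possibly encrypted) handler addresses
    decode_id       -- hypothetical: turns a pcode byte into a table index
    decrypt_address -- hypothetical: turns a table entry into a real address
    """
    vip = start
    for _ in range(steps):
        if vip < 0:
            break
        handler_id = decode_id(pcode[vip]) & 0xFF
        yield vip, handler_id, decrypt_address(handler_table[handler_id])
        vip -= 1  # the pcode pointer moves towards lower addresses

# Toy data, purely for illustration.
table = [0x400000 + i for i in range(0x100)]
pcode = bytes([0x1D, 0x20, 0x20, 0xBF])
trace = list(trace_dispatch(pcode, start=len(pcode) - 1, handler_table=table))
```

Once the real decode and decrypt steps are known, plugging them in should reproduce the handler log without running the target.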



I obviously need to deobfuscate every handler and somehow teach my program to understand which instructions are valid, and then make it decide whether it's a push, pop, call or something else... but I haven't done such stuff before, and it sounds sooo complex to automate. Could you guide me from that step? I would really love to UV the VMP someday... :)


 


Greetz



<snip>

I have told you several times before that you need to study books on automata and compiler technology. The area you are trying to explore is a 7th- and 8th-semester subject of computer engineering. Either start with the basics or stop asking stupid questions. It's not that you cannot do this if you aren't a CS student (on the contrary, most of the CS students I have met avoid this area because of its complexity). Just use your brain and some common sense, armed with the power of programming, and you will be in a good position to become the new Deathway.

 

 

I obviously need to deobfuscate every handler and somehow teach my program to understand which instructions are valid, and then make it decide whether it's a push, pop, call or something else... but I haven't done such stuff before, and it sounds sooo complex to automate. Could you guide me from that step? I would really love to UV the VMP someday...

The obvious answer to this question is automata theory. Compilers do what you are trying to do. Learn the basics. Obviously it's way simpler than writing a compiler, but at the very least you will need to study the concepts behind lexical analysers.

 

Sorry for sounding rude, but the questions you have asked already have their answers buried deep inside the books. So take your time and do it the right way. It will save you time and spare you the unnecessary trouble of asking questions and PMing people on the forum.

 

Best of luck. 


  • 1 month later...

Perhaps it is about time I shared my tool with you guys. This is a debugger and devirtualizer for VMP-virtualized code. Note that by "devirtualizer" I mean that it shows which instructions of the VMP machine get executed (not the original x86 code). It lets you debug and place breakpoints. Please try it, and if you like it, please develop it further.


 


 


Confused? Read the intro.txt file and try to follow the example. 


VMPDBG_0_1_0_SRC.zip


[Screenshot attachment: f982993327.jpg]


 


 


Bleebble Blabble Haz Debuggz.


 


Now, I do have a few hints though. If thy wishes to recover thy original x86 code, thy must either:

-> Recognize the pattern of machine opcodes for a particular x86 instruction (and replace it back)
-> Or apply compiler optimization techniques on this new machine (the VMP machine).


Edited by xSRTsect


Wonderful job :) But moving from one challenge to another is quite exhausting.


Actually, I am not sure how deathway devirtualized the Themida machine; I myself have never reversed Themida. The problem here is that the machine being emulated is not an x86 machine but a custom stack-based machine, which means that each instruction does not correspond to an original x86 instruction but to a "random" stack machine op (not so random, because the final semantics have to be preserved). But if this is also the case with Themida and deathway managed to pull it off, then he is a 0x1337 haxx0r. My regards.



Wow, the amount of knowledge you possess is huge! I wish I could do a Themida unvirtualizer someday... :)


 


I downloaded the tool, but I'm kind of confused about which values to fill in (I know the VM EP, the ESI value with the pcode, and the handler table), but the values don't seem to work. Can you tell me which values you filled in to get the result in the screenshot?


 


Edit: Oh, it is RVA and not VA :) Hats off to you, great job man!


Edited by Pancake

Now you can try to follow MistHill's analysis on the results and keep track of the registers so that you can see how the variables evolve. Also, it would be very helpful if someone could TL;DR for me how the Themida machine works: is it also a stack-based machine? If so, are you positive that DeathWay's decompiler is actually a compiler, i.e. that it contains a parser *and* a built-in optimizer?


 


Also, the GUI is not that awesome; anyone is free to improve it! (Other features are kinda derp, too.) But remember that this was a tool just for me.


Edited by xSRTsect

I have noticed a bug in the decoding of the "PUSH_32" and "PUSH_16" instructions. I'll have to look into it more deeply, because sometimes the instruction is actually pushing the stack value and not a hardcoded int32/int16 value.


 


Notice this part of the code:



40e5d2|PUSH_32 100
40e5cf|PUSH_32 8
40e5cd|PUSH_32 STACK
40e5cc|ADD N2, N1
40e5cb|POP_32 R7
40e5c9|PUSH_32 c
40e5c7|PUSH_32 STACK
40e5c6|ADD N2, N1
40e5c5|POP_32 R7
40e5c3|NOR_32 N2, N1 ;CF
40e5c2|POP_32 Rb
40e5c0|ADD N2, N1
40e5bf|POP_32 Rb
40e5bd|PUSH_32 STACK
40e5bc|FETCH32 N1
40e5bb|NOR_32 N2, N1 ;CF
40e5ba|POP_32 Rb
40e5b8|MOV STACK, N1 

The semantics are more or less ~(~(STACK | STACK) + 100), which is equivalent to sub STACK, 100: since ~x = -x - 1, we get ~(~x + 100) = x - 100.
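
The identity can be checked numerically. This is a small Python sketch with made-up helper names that models the NOR and ADD handlers as plain 32-bit arithmetic; whether the 100 in the listing is hex or decimal does not matter for the identity, so the constant is a parameter.

```python
MASK32 = 0xFFFFFFFF

def nor32(a, b):
    """32-bit NOR, the only logic primitive used in the listing."""
    return ~(a | b) & MASK32

def not32(a):
    """NOT built from NOR by feeding the same value twice."""
    return nor32(a, a)

def sub_via_nor_add(x, k):
    """Compute x - k as ~(~x + k), wrapping at 32 bits."""
    return not32((not32(x) + k) & MASK32)

# Spot-check against ordinary subtraction, including wrap-around cases.
samples = (0, 1, 0x100, 0x12345678, MASK32)
results = [sub_via_nor_add(x, 0x100) for x in samples]
expected = [(x - 0x100) & MASK32 for x in samples]
```

Here `results` and `expected` come out equal for every sample, including the values that wrap below zero.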



40e5b7|PUSH_32 9
40e5b5|POP_32 Rb

For mov Rb, 9.



40e5b3|PUSH_32 666
40e5b0|POP_32 R7

mov R7, 666



40e5ae|PUSH_32 Rd
40e5ac|PUSH_32 STACK
40e5ab|FETCH32 N1 ; STACK = RD, RD
40e5aa|NOR_32 N2, N1 ;CF;
40e5a9|POP_32 R9; STACK = ~RD
40e5a7|PUSH_32 Rd ; STACK = RD, ~RD
40e5a5|PUSH_32 STACK
40e5a4|FETCH32 N1; RD, RD , ~RD
40e5a3|NOR_32 N2, N1 ;CF
40e5a2|POP_32 Ra ; ~RD, ~RD
40e5a0|NOR_32 N2, N1 ;CF;
40e59f|POP_32 R9; R9 = ~RD; STACK: ~RD
40e59d|PUSH_32 Rd; STACK RD, ~RD
40e59b|PUSH_32 STACK
40e59a|FETCH32 N1; RD, RD, ~RD
40e599|NOR_32 N2, N1 ;CF
40e598|POP_32 R9; ~RD, ~RD
40e596|NOR_32 N2, N1 ;CF
40e595|POP_32 Ra
40e593|POP_32 R9; R9=RD
40e591|PUSH_32 R9;
40e58f|PUSH_32 STACK
40e58e|FETCH32 N1
40e58d|NOR_32 N2, N1 ;CF
40e58c|POP_32 Ra; R9,R9
40e58a|PUSH_32 Rc
40e588|PUSH_32 STACK
40e587|FETCH32 N1
40e586|NOR_32 N2, N1 ;CF
40e585|POP_32 R2; RC,RC,R9,R9
40e583|NOR_32 N2, N1 ;CF
40e582|POP_32 R2
40e580|POP_32 R2; R2 = ~RC

Also, this appears to be some sort of twisted way to emulate the next two instructions: mov R9, 0 and mov R2, 0.

 

Next we have:

 



40e57e|PUSH_32 Rb
40e57c|PUSH_32 R7
40e57a|PUSH_32 R9
40e578|IDIV N3
40e577|POP_32 Rf
40e575|POP_32 Rd
40e573|POP_32 R5


 


Which emulates the division instruction:


 


Rd = 666 / 9


R5 = 666 % 9
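
The division sequence can be replayed on a toy stack. This Python sketch rests on assumptions that the listing does not confirm: that IDIV takes the high dword from the top of the stack (R9), then the low dword (R7), then the divisor (Rb), and that it pushes remainder, quotient and flags so that the three POPs land in Rf, Rd and R5. The flags value is a placeholder, the x86 divide-overflow fault is not modelled, and 666 is treated as decimal.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value, bits):
    """Interpret an unsigned integer as a two's-complement signed value."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def idiv32(high, low, divisor):
    """Signed 64-by-32 division, truncating toward zero like x86 IDIV."""
    dividend = to_signed((high << 32) | low, 64)
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error")
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    return quotient & MASK32, remainder & MASK32

def run_division(regs):
    """Replay PUSH Rb / PUSH R7 / PUSH R9 / IDIV / POP Rf / POP Rd / POP R5."""
    stack = [regs["Rb"], regs["R7"], regs["R9"]]  # last pushed is on top
    high, low, divisor = stack.pop(), stack.pop(), stack.pop()
    quotient, remainder = idiv32(high, low, divisor)
    flags = 0  # placeholder: the real handler would push the resulting flags
    stack += [remainder, quotient, flags]
    out = dict(regs)
    out["Rf"], out["Rd"], out["R5"] = stack.pop(), stack.pop(), stack.pop()
    return out

result = run_division({"Rb": 9, "R7": 666, "R9": 0})
```

With these inputs `result["Rd"]` holds the quotient and `result["R5"]` the remainder, matching the two lines above.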

 


Also I have found this pretty sick way of emulating the rol instruction:



40e7fc|PUSH_16 3
40e7fa|PUSH_32 R4
40e7f8|PUSH_32 STACK
40e7f7|FETCH32 N1
40e7f6|SHR_64 <N1,N2>, BYTE N3
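
One possible reading of this listing, sketched in Python: pushing R4 twice builds the 64-bit value R4:R4, and shifting that pair right by n leaves a rotated copy of R4 in the low dword. The sketch assumes the low dword is the part that gets kept, which the listing does not show. Under that assumption the result is a rotate right by n, which is the same as a rotate left by 32 - n.

```python
MASK32 = 0xFFFFFFFF

def ror32(x, n):
    """Reference 32-bit rotate right."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rol32(x, n):
    """Reference 32-bit rotate left."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotate_via_shr64(x, n):
    """Duplicate x into a 64-bit pair, shift right, keep the low dword."""
    doubled = (x << 32) | x
    return (doubled >> n) & MASK32

value = 0x80000001
rotated = rotate_via_shr64(value, 3)
agrees = all(
    rotate_via_shr64(value, n) == ror32(value, n) == rol32(value, (32 - n) % 32)
    for n in range(32)
)
```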

All in all, I think there are a few instruction sequences that can be represented by simpler ones in order to reduce the obfuscation mass. For instance, this instruction sequence appears many times:



40e5ae|PUSH_32 Rd
40e5ac|PUSH_32 STACK
40e5ab|FETCH32 N1 ; STACK = RD, RD
40e5aa|NOR_32 N2, N1 ;CF;
40e5a9|POP_32 R9; STACK = ~RD

which is just the machine's way of negating (bitwise NOT) the principal operand, N1. So I think that a better engine could kill a lot of the unnecessary obfuscation.
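
A pattern-based fold of this idiom could look like the following Python sketch. It works on the textual listing format used in this post; `PUSH_NOT_32` is an invented pseudo-instruction, not something the tool emits, and the fold assumes that the flags popped into the scratch register are never read afterwards.

```python
def strip_line(line):
    """Drop the address prefix and any trailing ';' comment from a listing line."""
    body = line.split("|", 1)[-1]
    return body.split(";", 1)[0].strip()

def is_not_idiom(window):
    """True if five consecutive ops form PUSH x / PUSH STACK / FETCH / NOR / POP."""
    return (len(window) == 5
            and window[0].startswith("PUSH_32 ")
            and window[0] != "PUSH_32 STACK"
            and window[1] == "PUSH_32 STACK"
            and window[2] == "FETCH32 N1"
            and window[3] == "NOR_32 N2, N1"
            and window[4].startswith("POP_32 "))

def fold_not_idiom(lines):
    """Replace every occurrence of the NOT idiom with one pseudo-instruction."""
    ops = [strip_line(l) for l in lines if l.strip()]
    out = []
    i = 0
    while i < len(ops):
        window = ops[i:i + 5]
        if is_not_idiom(window):
            operand = window[0].split(None, 1)[1]
            out.append(f"PUSH_NOT_32 {operand}")  # leaves ~operand on the stack
            i += 5
        else:
            out.append(ops[i])
            i += 1
    return out

listing = [
    "40e5ae|PUSH_32 Rd",
    "40e5ac|PUSH_32 STACK",
    "40e5ab|FETCH32 N1 ; STACK = RD, RD",
    "40e5aa|NOR_32 N2, N1 ;CF;",
    "40e5a9|POP_32 R9; STACK = ~RD",
    "40e5b7|PUSH_32 9",
    "40e5b5|POP_32 Rb",
]
folded = fold_not_idiom(listing)
```

Running more folds of this kind in a loop until nothing changes is a simple form of the peephole optimization mentioned earlier in the thread.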


Edited by xSRTsect