Conquest Posted June 28, 2015

Pancake said:

That's a hell of a list. Thanks. Today I spent a lot of time on it and I must say I picked up a lot about that VM software. I unvirtualized and deobfuscated the handler responsible for the call instruction, overwrote the handler addresses, and indeed it worked. I can say that I see general patterns here. I created the sniffer as a DLL (inline hook in the main VM handler). I want to go forward, so here is a little question to all the great people here.

Now I must somehow understand the PCode pointed to by ESI. There are 0x100 entries in the handlers table. As far as I understand, each is responsible for handling a particular instruction; in my case handlers[0x1D] was responsible for CALL sprintf, CALL MessageBoxA and "VM EXIT" (maybe it's just RET and not CALL, I guess?). At the entry point of the 0x1D handler the call address is already on the stack, so most probably the handler before it prepared the call address. There are a lot of 0x20 handlers. I add a small log of the handler IDs and pcode (licznik = counter in my language).

Thanks everyone again for such quick answers! A day ago I was terrified when I saw a VM. Now I'm... a boss. So now I'm going back to work. I think I will check the handlers before invoking the msgbox. I also wonder where it stores the context (registers, flags); that will be my second task.
Licznik 1020 Handler address 405E50 Handler ID 15 PCode 47
Licznik 1021 Handler address 404A4D Handler ID 6E PCode F4
Licznik 1022 Handler address 4061C2 Handler ID BF PCode 97
Licznik 1023 Handler address 405E50 Handler ID 15 PCode 19
Licznik 1024 Handler address 4061C2 Handler ID BF PCode 67
Licznik 1025 Handler address 404EB5 Handler ID 17 PCode 23
Licznik 1026 Handler address 404430 Handler ID 47 PCode CD
Licznik 1027 Handler address 40134E Handler ID 61 PCode 9C
Licznik 1028 Handler address 404A4D Handler ID 5E PCode 1C
Licznik 1029 Handler address 404A4D Handler ID 5E PCode D5
Licznik 1030 Handler address 4061C2 Handler ID BF PCode A6
Licznik 1031 Handler address 4061C2 Handler ID EA PCode 34
Licznik 1032 Handler address 4061C2 Handler ID 20 PCode 60
Licznik 1033 Handler address 4061C2 Handler ID 20 PCode 70
Licznik 1034 Handler address 4061C2 Handler ID 20 PCode A4
Licznik 1035 Handler address 4061C2 Handler ID 20 PCode 7C
Licznik 1036 Handler address 4061C2 Handler ID 20 PCode A0
Licznik 1037 Handler address 4061C2 Handler ID 20 PCode 5C
Licznik 1038 Handler address 4061C2 Handler ID BF PCode 0F
Licznik 1039 Handler address 4061C2 Handler ID 20 PCode 07
Licznik 1040 Handler address 4061C2 Handler ID 20 PCode 43
Licznik 1041 Handler address 4052AB Handler ID 1D PCode C9 <-- calls MessageBoxA

The C9 pcode is the return handler. VMP does its jumps the push/ret way; for calls it is push, push, ret, etc. One thing I'd like to mention in this regard: the truth about "original code gets lost" lies in the fact that CISC code is translated into small RISC handler code, as you can see from the call example. Check the VMP reversing documents and you will understand what I mean. For complete restoration, pattern-based search and replacement is the best way to do it (check Deathway's plugin, a great example).
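A log like the one above is easier to reason about once it is parsed. The following is a minimal sketch, assuming the log keeps exactly the "Licznik ... Handler address ... Handler ID ... PCode ..." layout shown in the post; the names `parse_log` and `handler_histogram` are made up for illustration. It extracts each entry and counts how often every handler ID runs, which makes observations such as "there are a lot of 0x20 handlers" easy to quantify.

```python
import re
from collections import Counter

# Last entries of the sniffer log quoted in the post.
LOG = """\
Licznik 1037 Handler address 4061C2 Handler ID 20 PCode 5C
Licznik 1038 Handler address 4061C2 Handler ID BF PCode 0F
Licznik 1039 Handler address 4061C2 Handler ID 20 PCode 07
Licznik 1040 Handler address 4061C2 Handler ID 20 PCode 43
Licznik 1041 Handler address 4052AB Handler ID 1D PCode C9 <-- calls MessageBoxA
"""

ENTRY = re.compile(
    r"Licznik (\d+) Handler address ([0-9A-Fa-f]+) "
    r"Handler ID ([0-9A-Fa-f]+) PCode ([0-9A-Fa-f]+)"
)

def parse_log(text):
    """Return a list of (counter, handler_address, handler_id, pcode) tuples."""
    return [
        (int(counter), int(addr, 16), int(hid, 16), int(pcode, 16))
        for counter, addr, hid, pcode in ENTRY.findall(text)
    ]

def handler_histogram(entries):
    """Count how many times each handler ID appears in the trace."""
    return Counter(hid for _, _, hid, _ in entries)

entries = parse_log(LOG)
hist = handler_histogram(entries)
for hid, count in hist.most_common():
    print(f"handler {hid:02X}: {count} executions")
```

Because the pattern is matched with `findall`, the same function also works on a log that was pasted as one long line.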
Pancake Posted July 15, 2015

I spent some time on VMProtect (I'm still a newbie in VMs though). I found out where it stores the context (registers, flags). I also see the flow of the main VM handler, which gets the encrypted VM handler addresses from the table, with ESI walking the pcode backwards. Where I am getting stuck is: how can I find out what every handler does? I have a log of handler execution from beginning to end, but how can I tell which handler is doing what? I manually copied and deobfuscated the return handler, where I clearly see a pop of every register plus ret, and the push handler, which emulates the push in the preserved context. I obviously need to deobfuscate every handler, and somehow teach my program to understand which instruction is valid, and then make it decide whether it's a push, pop, call or something else... but I have never done such stuff, and it sounds so complex to automate. Could you guide me from that step? I would really love to UV VMP someday... Greetz
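The flow described above (fetch a pcode byte, use it as an index into the 0x100-entry table, decrypt the stored handler address, jump to the handler, with the pcode pointer moving backwards) can be modelled in a few lines. This is only a toy sketch: the XOR "decryption", the key, the fake addresses and the three handlers are all assumptions made for illustration, not VMProtect's actual scheme.

```python
KEY = 0x5A  # hypothetical key; the real decryption is not described in the thread

def push_imm(state, pcode, ip):
    # The operand byte sits right "after" the opcode in reading order (lower index).
    state["stack"].append(pcode[ip])
    return ip - 1

def add(state, pcode, ip):
    b = state["stack"].pop()
    a = state["stack"].pop()
    state["stack"].append((a + b) & 0xFFFFFFFF)
    return ip

def vm_exit(state, pcode, ip):
    state["running"] = False
    return ip

# Fake "addresses" standing in for real handler locations.
IMPLS = {0x1000: push_imm, 0x2000: add, 0x3000: vm_exit}

# 0x100-entry table of encrypted addresses; unused slots fall back to vm_exit.
TABLE = [0x3000 ^ KEY] * 0x100
TABLE[0x01] = 0x1000 ^ KEY
TABLE[0x02] = 0x2000 ^ KEY

def run(pcode, start):
    """Dispatch loop: returns the final stack and the trace of executed opcodes."""
    state = {"stack": [], "running": True}
    trace = []
    ip = start
    while state["running"] and ip >= 0:
        opcode = pcode[ip]
        trace.append(opcode)
        handler = IMPLS[TABLE[opcode] ^ KEY]   # decrypt, then "jump"
        ip = handler(state, pcode, ip - 1)     # pcode is consumed backwards
    return state["stack"], trace

# Program stored back to front: push 2, push 3, add, exit.
PCODE = bytes([0xFF, 0x02, 3, 0x01, 2, 0x01])
stack, trace = run(PCODE, len(PCODE) - 1)
print(stack, [hex(op) for op in trace])
```

A sniffer hooked at the dispatch point sees exactly the `trace` list, which is why a raw log of handler IDs alone does not say what each handler does: the semantics live in the handler bodies.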
Conquest Posted July 16, 2015

<snip>

I have told you several times before that you need to study books on automata theory and compiler technology. The area you are trying to explore is covered in the 7th and 8th semester subjects of computer engineering. Either start with the basics or stop asking stupid questions. It's not like you cannot do this if you aren't a CS student (on the contrary, most of the CS students I have met avoid this area because of its complexity). Just use your brain and some common sense, armed with the power of programming, and you will be in a good position to become the new Deathway.

Pancake said:

I obviously need to deobfuscate every handler, and somehow teach my program to understand which instruction is valid, and then make it decide whether it's a push, pop, call or something else... but I have never done such stuff, and it sounds so complex to automate. Could you guide me from that step? I would really love to UV VMP someday...

The obvious answer to this question is automata theory. Compilers do what you are trying to do. Learn the basics. Obviously it's way simpler than making a compiler, but at the very least you will need to study lexical analyser concepts. Sorry for sounding rude, but the questions you have asked already have their answers buried deep inside the books. So take your time and do it the right way. It will save you time and spare you the unnecessary trouble of asking questions and PMing people on the forum. Best of luck.
xSRTsect Posted August 22, 2015

Perhaps it was about time I shared my tool with you guys. This is a debugger and devirtualizer for VMP-virtualized code. Note that when I say devirtualizer, I mean it shows which machine instructions the VM executes (not the actual original x86 code). It allows you to debug and place breakpoints. Please try it, and if you like it, please develop it further. Confused? Read the intro.txt file and try to follow the example.

VMPDBG_0_1_0_SRC.zip
xSRTsect Posted August 22, 2015

Bleebble Blabble Haz Debuggz. Now: I do have a few hints though. If thy wishes to recover thy original x86 code, thy must either:
-> Realize a pattern of machine opcodes for a particular x86 instruction (and replace it back)
-> Or apply compiler optimization techniques on this new machine (the VMP machine).

Edited August 22, 2015 by xSRTsect
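The second hint (treating the VM listing like compiler input) can be illustrated with a tiny symbolic evaluator: instead of executing stack operations on numbers, run them on expression strings and emit an assignment whenever a value is popped into a register. This is a minimal sketch under simplifying assumptions: the opcode names and their semantics are invented for the example, and real handlers may also push flags or take operands in a different order.

```python
def lift(ops):
    """Symbolically run a stack-machine listing and return register assignments."""
    stack = []
    assigns = []
    for op, *args in ops:
        if op == "PUSH":                 # push a constant or a register name
            stack.append(str(args[0]))
        elif op == "POP":                # pop the top expression into a register
            assigns.append(f"{args[0]} = {stack.pop()}")
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} + {b})")
        elif op == "NOR":
            b, a = stack.pop(), stack.pop()
            stack.append(f"~({a} | {b})")
        else:
            raise ValueError(f"unknown op {op}")
    return assigns

listing = [
    ("PUSH", "0x9"), ("POP", "Rb"),                      # behaves like mov Rb, 9
    ("PUSH", "Ra"), ("PUSH", "0x100"), ("ADD",), ("POP", "R7"),
]
for line in lift(listing):
    print(line)
```

The output is a register-to-register view of the listing, which is the form that later optimization passes (constant folding, dead-store removal) or pattern matching against x86 instructions would operate on.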
DMichael Posted August 22, 2015

xSRTsect said:

Bleebble Blabble Haz Debuggz. Now: I do have a few hints though. If thy wishes to recover thy original x86 code, thy must either:
-> Realize a pattern of machine opcodes for a particular x86 instruction (and replace it back)
-> Or apply compiler optimization techniques on this new machine (the VMP machine).

Wonderful job :) but moving from one challenge to another is quite exhausting.
xSRTsect Posted August 22, 2015

Actually, I am not sure how Deathway devirtualized the Themida machine; I myself have never reversed Themida. The problem here is that the machine being emulated is not an x86 machine but a custom stack-based machine, which means that each instruction does not correspond to an original x86 instruction but to a "random" stack machine op (not so random, because the final semantics have to be preserved). But if this is the case with Themida and Deathway managed to pull it off, then he is a 0x1337 haxx0r. My regards.
Pancake Posted August 22, 2015

Wow, the amount of knowledge you possess is huge! I wish I could do a Themida unvirtualizer someday... I downloaded the tool, but I'm kind of confused about which values to fill in (I know the VM EP, the ESI value with the pcode, and the handlers table), but the values don't seem to work. Can you tell me which values you filled in to get the result in the screenshot?

Edit: Oh, it is RVA and not VA. Hats off to you, great job man!

Edited August 22, 2015 by Pancake
xSRTsect Posted August 22, 2015

Now you can try to follow MistHill's analysis on the results and keep track of the registers so that you can see the evolution of the variables. Now, it would be very helpful if someone could TL;DR me on how the Themida machine works: is it also a stack-based machine? If so, are you positive that DeathWay's decompiler is actually a compiler, i.e. that it contains a parser *and* a built-in optimizer? Also, the GUI is not that awesome and anyone is free to improve it (other features are kinda derp too), but remember that this was a tool just for me.

Edited August 22, 2015 by xSRTsect
xSRTsect Posted August 23, 2015

I have noticed a bug in decoding the instructions "PUSH_32" and "PUSH_16". I'll have to check that out more deeply, because sometimes it is actually pushing the stack value, and not a hardcoded int32/16 value. Notice this part of the code:

40e5d2|PUSH_32 100
40e5cf|PUSH_32 8
40e5cd|PUSH_32 STACK
40e5cc|ADD N2, N1
40e5cb|POP_32 R7
40e5c9|PUSH_32 c
40e5c7|PUSH_32 STACK
40e5c6|ADD N2, N1
40e5c5|POP_32 R7
40e5c3|NOR_32 N2, N1 ;CF
40e5c2|POP_32 Rb
40e5c0|ADD N2, N1
40e5bf|POP_32 Rb
40e5bd|PUSH_32 STACK
40e5bc|FETCH32 N1
40e5bb|NOR_32 N2, N1 ;CF
40e5ba|POP_32 Rb
40e5b8|MOV STACK, N1

The semantics are more or less like ~(~(STACK | STACK)+100), which is equivalent to sub STACK, 100.

40e5b7|PUSH_32 9
40e5b5|POP_32 Rb

For mov Rb, 9.

40e5b3|PUSH_32 666
40e5b0|POP_32 R7

mov R7, 666

40e5ae|PUSH_32 Rd
40e5ac|PUSH_32 STACK
40e5ab|FETCH32 N1 ; STACK = RD, RD
40e5aa|NOR_32 N2, N1 ;CF;
40e5a9|POP_32 R9; STACK = ~RD
40e5a7|PUSH_32 Rd ; STACK = RD, ~RD
40e5a5|PUSH_32 STACK
40e5a4|FETCH32 N1; RD, RD , ~RD
40e5a3|NOR_32 N2, N1 ;CF
40e5a2|POP_32 Ra ; ~RD, ~RD
40e5a0|NOR_32 N2, N1 ;CF;
40e59f|POP_32 R9; R9 = ~RD; STACK: ~RD
40e59d|PUSH_32 Rd; STACK RD, ~RD
40e59b|PUSH_32 STACK
40e59a|FETCH32 N1; RD, RD, ~RD
40e599|NOR_32 N2, N1 ;CF
40e598|POP_32 R9; ~RD, ~RD
40e596|NOR_32 N2, N1 ;CF
40e595|POP_32 Ra
40e593|POP_32 R9; R9=RD
40e591|PUSH_32 R9;
40e58f|PUSH_32 STACK
40e58e|FETCH32 N1
40e58d|NOR_32 N2, N1 ;CF
40e58c|POP_32 Ra; R9,R9
40e58a|PUSH_32 Rc
40e588|PUSH_32 STACK
40e587|FETCH32 N1
40e586|NOR_32 N2, N1 ;CF
40e585|POP_32 R2; RC,RC,R9,R9
40e583|NOR_32 N2, N1 ;CF
40e582|POP_32 R2
40e580|POP_32 R2; R2 = ~RC

Also, this appears to be some sort of twisted way to emulate the next two instructions: mov R9, 0 and mov R2, 0.
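The NOR-based identities used in the annotations above can be checked numerically. The sketch below assumes 32-bit wraparound arithmetic and reads the constant 100 as hexadecimal (an assumption; the identity holds for any constant). It verifies that NOR of a value with itself is a bitwise NOT, that ~(~x + k) equals x - k, and that NOR of two negated values is an AND. The helper names are made up for illustration.

```python
MASK = 0xFFFFFFFF  # 32-bit wraparound

def nor32(a, b):
    """32-bit NOR, the only logic primitive appearing in the listing."""
    return ~(a | b) & MASK

def not32(x):
    """NOT built from NOR: nor(x, x) == ~x."""
    return nor32(x, x)

def and32(a, b):
    """AND built from NOR: nor(~a, ~b) == a & b."""
    return nor32(not32(a), not32(b))

def sub_via_not_add(x, k):
    """Subtraction built from NOT and ADD: ~(~x + k) == x - k (mod 2**32)."""
    return not32((not32(x) + k) & MASK)

for x in (0, 1, 0x666, 0x7FFFFFFF, 0xFFFFFFFF):
    assert not32(x) == x ^ MASK
    assert sub_via_not_add(x, 0x100) == (x - 0x100) & MASK
    assert and32(x, 0x0F0F0F0F) == x & 0x0F0F0F0F

print(hex(sub_via_not_add(0x666, 0x100)))
```

The subtraction identity follows from two's complement: ~y equals -y - 1, so ~(~x + k) = -(-x - 1 + k) - 1 = x - k.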
Next we have:

40e57e|PUSH_32 Rb
40e57c|PUSH_32 R7
40e57a|PUSH_32 R9
40e578|IDIV N3
40e577|POP_32 Rf
40e575|POP_32 Rd
40e573|POP_32 R5

Which emulates the division instruction:

Rd = 666 / 9
R5 = 666 % 9

Also, I have found this pretty sick way of emulating the rol instruction:

40e7fc|PUSH_16 3
40e7fa|PUSH_32 R4
40e7f8|PUSH_32 STACK
40e7f7|FETCH32 N1
40e7f6|SHR_64 <N1,N2>, BYTE N3

All in all, I think there are a few instruction sequences that can be represented by simpler ones in order to reduce the obfuscation mass. For instance, this instruction sequence appears many times:

40e5ae|PUSH_32 Rd
40e5ac|PUSH_32 STACK
40e5ab|FETCH32 N1 ; STACK = RD, RD
40e5aa|NOR_32 N2, N1 ;CF;
40e5a9|POP_32 R9; STACK = ~RD

which is just the way the machine has to negate the principal operand, N1. So I think that a better engine may kill a lot of the unnecessary obfuscation.

Edited August 23, 2015 by xSRTsect
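The recurring negate sequence is a natural target for a peephole pass. The following is a minimal sketch, not the tool's actual engine: it works on the mnemonic text with addresses stripped, `NOT_32 N1` is a made-up pseudo-instruction, and it assumes the value popped right after the NOR (apparently the flags) is never used afterwards.

```python
def collapse_not(listing):
    """Replace 'PUSH_32 STACK / FETCH32 N1 / NOR_32 N2, N1 / POP_32 Rx'
    with a single hypothetical 'NOT_32 N1' pseudo-instruction."""
    out = []
    i = 0
    while i < len(listing):
        window = listing[i:i + 4]
        if (
            len(window) == 4
            and window[0] == "PUSH_32 STACK"      # duplicate the top of stack
            and window[1] == "FETCH32 N1"
            and window[2] == "NOR_32 N2, N1"      # nor(x, x) == ~x
            and window[3].startswith("POP_32 ")   # discard the pushed flags
        ):
            out.append("NOT_32 N1")
            i += 4
        else:
            out.append(listing[i])
            i += 1
    return out

listing = [
    "PUSH_32 Rd",
    "PUSH_32 STACK",
    "FETCH32 N1",
    "NOR_32 N2, N1",
    "POP_32 R9",
    "PUSH_32 Rc",
]
print(collapse_not(listing))
```

Running such passes repeatedly, with one rule per recognized idiom, shrinks the listing step by step; whether dropping the flags write is safe has to be confirmed per use, for example with a liveness check on the target register.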
MistHill Posted August 24, 2015

@xSRTsect Good work! Thanks for sharing. Oreans' virtual machine is not a stack-based VM. Each VM has its own context in one or a few memory blocks (vEntry/vExit, vRegs, handlers...) and is thread safe. Please refer to Deathway's "RISC Machine Documentation".
Sound Posted November 7, 2015

@xSRTsect Good work, thank you for sharing. @MistHill Great work. I always learn a lot from you and all the others.
xSRTsect Posted January 24, 2017

Can't you repost this challenge but with the x64 version of VMP? (If you are struggling to find such a version, PM me.)
Sean the hard worker Posted April 18

On 6/28/2015 at 5:11 AM, Pancake said:

That's a hell of a list. Thanks. Today I spent a lot of time on it and I must say I picked up a lot about that VM software. I unvirtualized and deobfuscated the handler responsible for the call instruction, overwrote the handler addresses, and indeed it worked. I can say that I see general patterns here. I created the sniffer as a DLL (inline hook in the main VM handler). I want to go forward, so here is a little question to all the great people here.

Now I must somehow understand the PCode pointed to by ESI. There are 0x100 entries in the handlers table. As far as I understand, each is responsible for handling a particular instruction; in my case handlers[0x1D] was responsible for CALL sprintf, CALL MessageBoxA and "VM EXIT" (maybe it's just RET and not CALL, I guess?). At the entry point of the 0x1D handler the call address is already on the stack, so most probably the handler before it prepared the call address. There are a lot of 0x20 handlers. I add a small log of the handler IDs and pcode (licznik = counter in my language).

Thanks everyone again for such quick answers! A day ago I was terrified when I saw a VM. Now I'm... a boss. So now I'm going back to work. I think I will check the handlers before invoking the msgbox. I also wonder where it stores the context (registers, flags); that will be my second task.
Licznik 1020 Handler address 405E50 Handler ID 15 PCode 47
Licznik 1021 Handler address 404A4D Handler ID 6E PCode F4
Licznik 1022 Handler address 4061C2 Handler ID BF PCode 97
Licznik 1023 Handler address 405E50 Handler ID 15 PCode 19
Licznik 1024 Handler address 4061C2 Handler ID BF PCode 67
Licznik 1025 Handler address 404EB5 Handler ID 17 PCode 23
Licznik 1026 Handler address 404430 Handler ID 47 PCode CD
Licznik 1027 Handler address 40134E Handler ID 61 PCode 9C
Licznik 1028 Handler address 404A4D Handler ID 5E PCode 1C
Licznik 1029 Handler address 404A4D Handler ID 5E PCode D5
Licznik 1030 Handler address 4061C2 Handler ID BF PCode A6
Licznik 1031 Handler address 4061C2 Handler ID EA PCode 34
Licznik 1032 Handler address 4061C2 Handler ID 20 PCode 60
Licznik 1033 Handler address 4061C2 Handler ID 20 PCode 70
Licznik 1034 Handler address 4061C2 Handler ID 20 PCode A4
Licznik 1035 Handler address 4061C2 Handler ID 20 PCode 7C
Licznik 1036 Handler address 4061C2 Handler ID 20 PCode A0
Licznik 1037 Handler address 4061C2 Handler ID 20 PCode 5C
Licznik 1038 Handler address 4061C2 Handler ID BF PCode 0F
Licznik 1039 Handler address 4061C2 Handler ID 20 PCode 07
Licznik 1040 Handler address 4061C2 Handler ID 20 PCode 43
Licznik 1041 Handler address 4052AB Handler ID 1D PCode C9 <-- calls MessageBoxA

How can we find the PCode "C9" in the debugger? Regards, Sean.