Tuts 4 You
HellSpider

[DevirtualizeMe] Themida 2.4.6.0

Question

HellSpider

Difficulty : 8
Language : C/C++
Platform : Windows 32-bit and 64-bit
OS Version : All
Packer / Protector : Themida 2.4.6.0

Description :

The objective is to interpret and reconstruct 3 procedures in each file that have been virtualized.
No additional options have been used.
The virtualized functions will execute when keys '1', '2' and '3' are pressed, respectively.

1 = WHITE
2 = RED
3 = BLACK

Only one "brand" of VM has been used per file.
I will upload additional ones when current challenges have been solved or seriously attempted.

Detailed information of the interpreting procedure/internals or a complete solution paper is preferable.

I will post similar challenges for other protectors if someone supplies me with a recent version (VMProtect, Enigma ...).

Accepted solutions:

FISH32        : @koolk
TIGER32     : @koolk
DOLPHIN32 : @koolk
FISH64        : @koolk
TIGER64     : @koolk

Files:

devirtualizeme_tmd_2.4.6.0_fish32.rar
devirtualizeme_tmd_2.4.6.0_tiger32.rar
devirtualizeme_tmd_2.4.6.0_dolphin32.rar
devirtualizeme_tmd_2.4.6.0_fish64.rar
devirtualizeme_tmd_2.4.6.0_tiger64.rar

Screenshot :

unpackme_fish32_2017-05-17_14-20-46.png.52056c2f8ac87b4d9566ca1454e2e2e4.png


22 answers to this question

Recommended Posts

  • 7
koolk

I haven't touched this project for a long time, so I spent this weekend updating the script and catching up with all the changes they made over the last 1-2 years.

Everything works right now except for TIGER. They added a new, weird "push" handler that is very different from any other TIGER handler. The offset for the push doesn't come from a parameter but from a call to another function that returns an internal state value. Usually that internal state value is combined with a parameter to get the real value, but this time it is combined with just a constant. In your binary, for example, one such handler is at 0x0562AC9. Nothing too bad, but I ran out of time this weekend. I will do it during the week and update this comment with the devirtualized TIGER when it is done.

Apart from that, most of the changes were small. Some fix bugged handlers, others add small protection templates to the handlers. One change is that the state is no longer reset when re-entering the VM after executing external instructions (instructions that they don't virtualize).

Another change is the start of the VM. Until now, the VM entry looked something like this (they push all the registers onto the stack before entering the VM):

pop VM_REG_1
pop VM_REG_2
pop VM_REG_3
..

They changed it to the following (in a random order):

mov VM_REG_1, [esp]
mov VM_REG_2, [esp+4]
mov VM_REG_3, [esp+8]
...
add esp, ...
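
To make the two entry styles concrete, here is a minimal Python sketch (not the actual devirtualizer script) that maps stack slots to VM registers for both the old pop-based entry and the new mov-based entry. The function name `map_entry_slots`, the string-based instruction format and the 4-byte slot size are assumptions made for illustration only. The point is that both styles reduce to the same slot-to-register mapping, regardless of the order of the mov instructions.

```python
import re

# Toy model: instructions are plain strings shaped like the listings above.
POP_RE = re.compile(r"^pop\s+(\w+)$")
MOV_RE = re.compile(r"^mov\s+(\w+),\s*\[esp(?:\+(\w+))?\]$")
ADD_RE = re.compile(r"^add\s+esp,\s*(\w+)$")


def map_entry_slots(instructions, slot_size=4):
    """Return {stack_offset: vm_register} for either VM entry style.

    Assumption: the entry consists only of pops, or only of
    `mov reg, [esp+k]` loads followed by one `add esp, N` cleanup.
    """
    slots = {}
    sp = 0  # running stack offset, used by the old pop-based entry
    for ins in instructions:
        ins = ins.strip().lower()
        m = POP_RE.match(ins)
        if m:
            slots[sp] = m.group(1)
            sp += slot_size
            continue
        m = MOV_RE.match(ins)
        if m:
            # Displacement is written in hex in the listings; none means 0.
            offset = int(m.group(2), 16) if m.group(2) else 0
            slots[offset] = m.group(1)
            continue
        if ADD_RE.match(ins):
            # Stack cleanup of the new-style entry: nothing to map.
            continue
        raise ValueError(f"unexpected instruction: {ins}")
    return slots


old_entry = ["pop VM_REG_1", "pop VM_REG_2", "pop VM_REG_3"]
new_entry = [
    "mov VM_REG_3, [esp+8]",
    "mov VM_REG_1, [esp]",
    "mov VM_REG_2, [esp+4]",
    "add esp, 0xC",
]
print(map_entry_slots(old_entry) == map_entry_slots(new_entry))
```

A real tool would work on decoded instructions rather than strings, but the normalization idea is the same.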

Another change is obfuscating the ending of some of the FISH and TIGER handlers.

FISH(32/64) BLACK is probably the most annoying VM, since the handlers are heavily obfuscated with fake conditional jumps and all of that shit. One big handler can be 100000+ instructions, so even a small bug in handling it can mess everything up. It is probably the safest VM because of that, but it is also really, really slow.

Oh, and in 64-bit my compiled devirtualized code isn't the same size as the original code. I am not sure why, or which of the compiled opcodes take more space than the original. But I still had enough space for the devirtualized code at the original address because of the surrounding macros.

devirtualizeme_tmd_2.4.6.0_fish32.devirtualize.clean.exe.7z

devirtualizeme_tmd_2.4.6.0_fish64.devirtualize.clean.exe.7z

  • Like 11

  • 1
koolk
49 minutes ago, HellSpider said:

Splendid. I don't think additional challenges are necessary. :)

Thank you for all those challenges. Maybe it is time to try one of your other challenges. (So far I have only worked on VMProtect for a short time, about two years ago, and then lost interest. The most challenging aspect of their VM is the obfuscation; besides that, it is pretty similar to the good old CISC VM. Maybe I will tackle that issue one day.)

 

And just to clarify the differences between the Themida VMs (FISH/TIGER/DOLPHIN), in case I gave the wrong impression about them: they share the same engine, but each VM has a unique aspect. For example, FISH has big handlers that can do multiple operations with different arguments, and the flow of what to do is controlled by the parameters of the opcode, while TIGER has a small handler for each possible combination. Each VM also has some unique "protections" (things that make it harder to deal with; the new thing in TIGER that I talked about in the first post is a good example of a protection that is unique to one VM). So solving one VM doesn't solve the others. A big part of the work carries over to the other VMs, but they will still provide some new challenge.
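
As a rough illustration of that design difference, here is a toy Python sketch. It is not taken from Themida; the operation set, register names and function names (`fish_style_handler`, `make_tiger_handlers`) are all invented for the example. One parameter-driven handler covers every case in the FISH-like style, while the TIGER-like style generates one tiny parameterless handler per combination.

```python
import operator

# Hypothetical operation set for the toy VM.
OPS = {"add": operator.add, "sub": operator.sub, "xor": operator.xor}


def fish_style_handler(regs, params):
    """FISH-like: one big handler; the opcode's parameters pick
    the operation and the operands at run time."""
    op, dst, src = params
    regs[dst] = OPS[op](regs[dst], regs[src]) & 0xFFFFFFFF


def make_tiger_handlers(reg_names):
    """TIGER-like: one small handler per (operation, dst, src)
    combination; each handler takes no parameters at all."""
    handlers = {}
    for op_name, fn in OPS.items():
        for dst in reg_names:
            for src in reg_names:
                # Default arguments freeze the combination for this handler.
                def handler(regs, fn=fn, dst=dst, src=src):
                    regs[dst] = fn(regs[dst], regs[src]) & 0xFFFFFFFF
                handlers[(op_name, dst, src)] = handler
    return handlers


regs_a = {"r0": 5, "r1": 3}
fish_style_handler(regs_a, ("sub", "r0", "r1"))

tiger = make_tiger_handlers(["r0", "r1"])
regs_b = {"r0": 5, "r1": 3}
tiger[("sub", "r0", "r1")](regs_b)

print(regs_a == regs_b, len(tiger))
```

For an analysis tool the consequence is that a FISH-like handler has to be understood once per parameter path, whereas a TIGER-like VM yields many more handlers that are each simpler to classify.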

  • Like 6

  • 0
HellSpider
Extremely impressive, once again.

You have managed to reconstruct the code flow 100% in both files. Same instructions, same registers and everything in the same order as the original binaries.

The 64-bit code is perfectly fine. The size might be confusing as the start and end macros have been compiled differently in the 64-bit file. The usual 0x12 byte macro is instead a 0x5 sized call to the macro specifier block.
As an example, function #1:

VM_START

0000000140001D30 | 48 81 EC 48 04 00 00     | sub rsp,448                             |
0000000140001D37 | E8 40 03 00 00           | call <unpackme_fish64.sub_14000207C>    | VM_START
0000000140001D3C | 48 8D 94 24 50 04 00 00  | lea rdx,qword ptr ss:[rsp+450]          | 

VM_END

0000000140001E03 | FF 15 7F D4 00 00        | call qword ptr ds:[<&MessageBoxW>]      |
0000000140001E09 | E8 88 02 00 00           | call <unpackme_fish64.sub_140002096>    | VM_END
0000000140001E0E | 48 81 C4 48 04 00 00     | add rsp,448                             |
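
The 0x5 sized macro calls in the listings above are plain `E8 rel32` near calls, so their targets can be checked by hand. A small Python sketch (the helper name `call_rel32_target` is made up for this example) that recomputes both targets from the addresses and bytes shown:

```python
import struct


def call_rel32_target(address, raw):
    """Return the target of a 5-byte `E8 rel32` near call at `address`."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not a 5-byte E8 rel32 call")
    # rel32 is a signed little-endian displacement from the next instruction.
    (rel,) = struct.unpack("<i", raw[1:])
    return (address + 5 + rel) & 0xFFFFFFFFFFFFFFFF


# Addresses and bytes copied from the VM_START / VM_END listings.
vm_start = call_rel32_target(0x140001D37, bytes.fromhex("E840030000"))
vm_end = call_rel32_target(0x140001E09, bytes.fromhex("E888020000"))
print(hex(vm_start), hex(vm_end))  # 0x14000207c 0x140002096
```

Both results match the `sub_14000207C` and `sub_140002096` labels in the disassembly.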


I will post additional challenges in the near future. :)

  • 0
nek0

I do not know if this is the right place, but from my limited research I can say that there is no difference between the VMs in CV (Code Virtualizer) and Themida. Anyway, if you want, I can send you some protected examples, HellSpider.

  • 0
HellSpider
7 minutes ago, nek0 said:

I do not know if it is the right place but with my little research I can say that there is no difference of VMs between CV and Themida. Anyway if you want I can send to you  some protected examples, HellSpider.

As far as I know, the VMs are identical in their general workings.
I think the only differences are that CV has optional compression-decompression blocks before VM entries and is missing the anti-dump VM bytecode.

  • 0
koolk

 

Oh, I forgot about the function macros. In Code Virtualizer I still use the regular macros, even in 64-bit.

And TIGER is ready and attached. (This time I removed all the Themida sections; I missed one in the previous binaries.)

devirtualizeme_tmd_2.4.6.0_tiger32.devirtualize.clean.exe.7z

  • Like 2

  • 0
HellSpider
11 hours ago, koolk said:

 

Oh, I forgot about the function macros. In code virtualizer I still use the regular macros, even in 64 bit.

And TIGER is ready and attached.  (And this time I removed all the themida sections, I missed one in the previous binaries)

devirtualizeme_tmd_2.4.6.0_tiger32.devirtualize.clean.exe.7z

Flawless solution. Any special preference for next VM?

  • 0
koolk
8 hours ago, HellSpider said:

Flawless solution. Any special preference for next VM?

Well, now it doesn't really matter. Except for maybe a few bugs that may show up, I hope that my script will handle anything.

 

Oh, and just a note about the anti-dump: it is mostly virtualized, obfuscated assembly, but after decompiling it is simple to detect it automatically and get rid of it. So it isn't really special VM bytecode (except for one handler that I have only seen used in it).

  • 0
Deathway

Impressive work, koolk! Just curious, which tools are you using for analysis, scripting and debugging?

D.

 

 

  • 0
koolk
1 hour ago, Deathway said:

Impressive work koolk!, just curious, which tools are you using for analysis, scriptting and debugging? 

D.

 

 

Mostly Python. I created a basic Python library for assembling/disassembling that uses yasm and udis86. Based on that I wrote all the code for the devirtualizer, which is 100% static analysis.

It uses a generic interface that only requires reading from and writing to a virtual address. So far I have implemented this interface for the Python libraries pefile (for static analysis of PE files) and winappdbg (for debugging/dynamic analysis), and even for a regular buffer in memory. I even implemented it for IDAPython (which means that I can devirtualize functions while in IDA).
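
A minimal sketch of what such a read/write-by-virtual-address interface could look like in Python. The class and function names (`MemoryInterface`, `BufferMemory`, `read_u32`) are hypothetical and not taken from the actual library; only the in-memory buffer backend is shown, since the pefile, winappdbg and IDAPython backends would wrap those libraries' own calls behind the same two methods.

```python
from abc import ABC, abstractmethod


class MemoryInterface(ABC):
    """Everything the analysis code needs: read and write at a virtual address."""

    @abstractmethod
    def read(self, address: int, size: int) -> bytes: ...

    @abstractmethod
    def write(self, address: int, data: bytes) -> None: ...


class BufferMemory(MemoryInterface):
    """Backend over a plain in-memory buffer mapped at `base`."""

    def __init__(self, base: int, data: bytes):
        self.base = base
        self.data = bytearray(data)

    def _offset(self, address: int, size: int) -> int:
        offset = address - self.base
        if offset < 0 or offset + size > len(self.data):
            raise ValueError(f"range {address:#x}+{size:#x} is not mapped")
        return offset

    def read(self, address: int, size: int) -> bytes:
        offset = self._offset(address, size)
        return bytes(self.data[offset:offset + size])

    def write(self, address: int, data: bytes) -> None:
        offset = self._offset(address, len(data))
        self.data[offset:offset + len(data)] = data


def read_u32(mem: MemoryInterface, address: int) -> int:
    """Backend-agnostic helper: works with any MemoryInterface."""
    return int.from_bytes(mem.read(address, 4), "little")


mem = BufferMemory(0x401000, bytes(16))
mem.write(0x401004, (0x0562AC9).to_bytes(4, "little"))
print(hex(read_u32(mem, 0x401004)))
```

Because the analysis code only depends on `read` and `write`, the same devirtualization logic can run against a file on disk, a live process or an IDA database.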

  • Like 5

  • 0
HellSpider

DOLPHIN32 and TIGER64 added.

  • 0
deepzero

Impressive work, koolk! If you don't mind me asking, what other Python libraries did you use? What IR, if any?

  • 0
OutSide
On 7/14/2017 at 2:09 AM, koolk said:

Mostly python. I created a basic python library for assembling/disassembling that uses yasm and udis86. Based on that I wrote all the code for the devirtualizer, which is 100% static analysis.

It uses generic interface that just requires reading/writing from/to virtual address. So far I implemented this interface for the python libraries pefile (for static analysis PE files) and winappdbg (for debugging/dynamic analysis), and even for regular buffer in the memory. I even implemented it for IDAPython (which mean that I can devirtualize functions while in IDA..).

How much time did it take you to implement the x64 engine after x86 was done?

  • 0
root
Big work.

Some questions about the static analysis:
       * How did you disassemble the code (recursively, linearly or ...)?
       * How did you rebuild and reduce the control flow graph (opaque predicates etc.)?
       * How did you optimize your code (peephole, constant folding etc.)?
       * How did you solve the register-exchange handlers (in the TIGER VM, for example) and the exchange of registers within the handlers?


Hello and thanks ;)

  • Like 1

  • 0
root

In my spare time I update my program.
Analyzing the Devirtualizeme_2.4.6.0_tiger32 file, I saw that the register in the portion of code that I highlighted (macro start at address 40C89A) is EDI instead of EAX. In the original code it is EDI or EAX (the program works well in both cases).
For the other procedures (x86-x64) that I've scanned, the code is the same.

Immagine.png

  • Like 4

Share this post


Link to post
  • 0
miraculix

Looks like a great tool, root. Can you talk about how you deobfuscate the assembly?

 

Like, how far do you go? Are peephole optimizations/pattern recognition enough, or have you written a full-fledged optimizing compiler?

  • 0
root

Hi, let me start by asking you not to request the program, because I will not make it public; I do not want to harm anybody.
Instead, I will release the source code of the deobfuscator as soon as I have time to fix some points.

@miraculix

The deobfuscator completely rebuilds the CFG (removing fake Jcc, opaque predicates, etc.), applies peephole optimization (pattern recognition), removes dead code, and performs constant folding, call analysis and more.
Thanks to the suggestions of @fvrmatteo I was able to try peephole solutions other than pattern recognition, but the result was never as efficient as pattern recognition, so I use that at the moment (I do reversing, not a conference at MIT; the code needs to work well .. hahahaha).
I only use Pascal. As the disassembler engine I use Capstone, and as the emulator (for small portions of code) I use Unicorn Engine. I do not use virtual machines, symbolic execution, Python scripts, etc.
I'm attaching a small video to give an idea.

deob.rar
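
For readers unfamiliar with the terms, here is a toy Python sketch of pattern-based peephole rewriting combined with constant folding. It is not root's deobfuscator (which is written in Pascal and built on Capstone); the tuple instruction format, the three rules and the function name `peephole` are assumptions made purely for illustration.

```python
def peephole(instrs):
    """Rewrite a list of toy instructions until no pattern matches.

    Instructions are tuples such as ("push", 5) or ("mov", "eax", 8).
    Simplification: CPU flags and esp-relative operands are ignored,
    which a real optimizer has to check before applying these rules.
    """
    changed = True
    while changed:
        changed = False
        out = []
        i = 0
        while i < len(instrs):
            cur = instrs[i]
            nxt = instrs[i + 1] if i + 1 < len(instrs) else None
            # Pattern: push X ; pop Y  ->  mov Y, X
            if nxt and cur[0] == "push" and nxt[0] == "pop":
                out.append(("mov", nxt[1], cur[1]))
                i += 2
                changed = True
                continue
            # Constant folding: mov R, imm1 ; add R, imm2  ->  mov R, imm1+imm2
            if (nxt and cur[0] == "mov" and nxt[0] == "add"
                    and cur[1] == nxt[1]
                    and isinstance(cur[2], int) and isinstance(nxt[2], int)):
                out.append(("mov", cur[1], (cur[2] + nxt[2]) & 0xFFFFFFFF))
                i += 2
                changed = True
                continue
            # Dead code: mov R, R does nothing
            if cur[0] == "mov" and cur[1] == cur[2]:
                i += 1
                changed = True
                continue
            out.append(cur)
            i += 1
        instrs = out
    return instrs


obfuscated = [
    ("push", 5),
    ("pop", "eax"),
    ("add", "eax", 3),
    ("mov", "ebx", "ebx"),
]
print(peephole(obfuscated))  # [('mov', 'eax', 8)]
```

The outer loop matters: the first pass turns the push/pop pair into a mov, which only then exposes the mov/add pair to constant folding.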

  • Like 5

  • 0
OutSide

@root What is the sense in making it public? This will force Oreans to rework their stuff, like it was with the Deathway plugin -> new VMs

  • 0
root
8 hours ago, OutSide said:

@root What a sense to make it public? This will force oreans to remake their stuff, like it was with deathway plugin -> new VMs 

I am not releasing the decoder but the code optimizer (and not immediately). It is not specific to the Oreans VM; it is only far more effective than others.
What do you say about angr, miasm, optimice or codedoctor? Do we eliminate all the tools for binary code analysis?
I am not releasing the decoder code because this is a hobby and I do not want to cause anybody harm, but reversing is sharing (I unfortunately belong to the old, old reverser school).
If I spoke good English I would probably share a lot more info, and I would not be like others who just write for self-celebration.
Do you know Scherzo or Softworm?

I'm an old man who now deals with reversing, and my only good luck is that the day everyone programs in Python or JavaScript I will not be around anymore .. hahahahaha

  • Like 4

  • -9
r00t_H@ck3r

Oh wow, has it not been solved?

Where has Deathway disappeared to, anyway :(
