Tuts 4 You

Automatic deobfuscation using symbolic execution and LLVM - Playing with the Tigress binary protection





From the author's webpage:



Tigress Protections

Tigress is a diversifying virtualizer/obfuscator for the C language that supports many novel defenses against both static and dynamic reverse engineering and de-virtualization attacks.

In particular, Tigress protects against static de-virtualization by generating virtual instruction sets of arbitrary complexity and diversity, by producing interpreters with multiple types of instruction dispatch, and by inserting code for anti alias analysis.

Tigress protects against dynamic de-virtualization by merging the real code with bogus functions, by inserting implicit flow, and by creating slowly-executing reentrant interpreters.

Tigress implements its own version of code packing through the use of runtime code generation. Finally, Tigress' dynamic transformation provides a generalized form of continuous runtime code modification.







VM descriptions:

The Tigress team has provided several challenges, each combining different kinds of protection:

  • VM-0: One level of virtualization, random dispatch.
  • VM-1: One level of virtualization, superoperators, split instruction handlers.
  • VM-2: One level of virtualization, bogus functions, implicit flow.
  • VM-3: One level of virtualization, instruction handlers obfuscated with arithmetic encoding, virtualized function is split and the split parts merged.
  • VM-4: Two levels of virtualization, implicit flow.
  • VM-5: One level of virtualization, one level of jitting, implicit flow.
  • VM-6: Two levels of jitting, implicit flow.
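To make "one level of virtualization" concrete, here is a minimal toy sketch in Python. It is not Tigress output: the opcodes, the bytecode and the hash formula are all made up for illustration. The idea it shows is that the original computation no longer appears as native code but as bytecode walked by a dispatch loop.

```python
# Toy illustration only: hypothetical opcodes and bytecode, not Tigress output.
MASK = 0xFFFFFFFFFFFFFFFF  # keep every value within 64 bits

PUSH_INPUT, PUSH_CONST, ADD, XOR, MUL, RET = range(6)

# Bytecode encoding the (made-up) hash ((x + 0x1234) ^ 0xBEEF) * 3
BYTECODE = [
    PUSH_INPUT,
    PUSH_CONST, 0x1234,
    ADD,
    PUSH_CONST, 0xBEEF,
    XOR,
    PUSH_CONST, 3,
    MUL,
    RET,
]


def run_vm(bytecode, x):
    """Interpret the bytecode on a small stack machine and return the result."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH_INPUT:
            stack.append(x & MASK)
        elif op == PUSH_CONST:
            stack.append(bytecode[pc] & MASK)
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif op == XOR:
            b, a = stack.pop(), stack.pop()
            stack.append((a ^ b) & MASK)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK)
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def plain_hash(x):
    """The same computation written directly, for comparison."""
    return ((((x & MASK) + 0x1234) & MASK) ^ 0xBEEF) * 3 & MASK
```

Calling `run_vm(BYTECODE, x)` and `plain_hash(x)` gives the same value for any `x`. The protections listed above (random dispatch, superoperators, bogus functions, nested levels and so on) make the interpreter far harder to read than this, but the principle is the same.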




Automatic deobfuscation

Our goals were to:

  • Symbolically extract the hash algorithm
  • Simplify these symbolic expressions
  • Provide Python scripts that compute the hash for a given input and find input collisions for a given hash
  • Provide a new simplified version of the binary

And all of this with only one generic script :). To do so, we proceeded in the following order:

  • Symbolically emulate the obfuscated binary with Triton
  • Concretize everything that is not related to the user input
  • Extract the hash algorithm and generate input->hash and hash->input scripts from templates
  • Convert Triton's expressions to Arybo's expressions
  • Convert Arybo's expressions to the LLVM-IR representation
  • Apply LLVM optimizations (O2)
  • Rebuild a simplified binary version

If you want more information, you can check out our solve-vm.py script.
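The first steps of that pipeline can be sketched with a toy example. This is only an illustration under loud assumptions: it uses plain Python strings instead of Triton, Arybo or LLVM, the VM and its bytecode are hypothetical, and it covers only the input->hash direction. It emulates a small interpreter, keeps anything that does not depend on the input as a concrete number, builds an expression only where the input `x` is involved, and then fills a template to emit a simplified function.

```python
from string import Template

MASK = 0xFFFFFFFFFFFFFFFF
PUSH_INPUT, PUSH_CONST, ADD, XOR, MUL, RET = range(6)
OPS = {ADD: "+", XOR: "^", MUL: "*"}

# Hypothetical bytecode. The constant 0x1234 is deliberately built from two
# pushes so that there is something input-independent to concretize.
BYTECODE = [
    PUSH_CONST, 0x1000, PUSH_CONST, 0x234, ADD,
    PUSH_INPUT, ADD,
    PUSH_CONST, 0xBEEF, XOR,
    PUSH_CONST, 3, MUL,
    RET,
]

# Template for the generated input->hash script (names are illustrative).
TEMPLATE = Template("def simplified_hash(x):\n    MASK = $mask\n    return $expr\n")


def concrete(op, a, b):
    """Evaluate one operation on two concrete 64-bit values."""
    if op == ADD:
        return (a + b) & MASK
    if op == XOR:
        return (a ^ b) & MASK
    return (a * b) & MASK


def fmt(value):
    """Render a stack value: ints are concrete, strings are expressions over x."""
    return hex(value) if isinstance(value, int) else value


def symbolic_run(bytecode):
    """Emulate the VM, concretizing every value that does not depend on x."""
    stack, pc = [], 0
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH_INPUT:
            stack.append("x")  # the only symbolic variable
        elif op == PUSH_CONST:
            stack.append(bytecode[pc] & MASK)
            pc += 1
        elif op in OPS:
            b, a = stack.pop(), stack.pop()
            if isinstance(a, int) and isinstance(b, int):
                stack.append(concrete(op, a, b))  # unrelated to the input
            else:
                stack.append(f"(({fmt(a)} {OPS[op]} {fmt(b)}) & MASK)")
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def build_simplified(bytecode):
    """Return the generated source and the callable compiled from it."""
    expr = fmt(symbolic_run(bytecode))
    source = TEMPLATE.substitute(mask=hex(MASK), expr=expr)
    namespace = {}
    exec(source, namespace)  # acceptable here: the source is generated locally
    return source, namespace["simplified_hash"]


SOURCE, simplified_hash = build_simplified(BYTECODE)
```

Printing `SOURCE` shows a single arithmetic expression over `x` with no trace of the dispatch loop, and the two-push constant has been folded into `0x1234`. In the real workflow that role is played by Triton's symbolic expressions, and the later simplification is done by LLVM's optimizations rather than by this kind of ad hoc folding.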




