Jump to content
Tuts 4 You

Edit History

DefCon42

DefCon42


clarification

Hey all!

I recently came across this neat paper here: https://tel.archives-ouvertes.fr/tel-01623849/document where they used what they called "Mixed-Boolean Arithmetic" to obfuscate arithmetic expressions, and then showed ways to deobfuscate them.

Looking a the deobfuscation methods, they seemed largely either pattern-based or wouldn't work when bigger numbers were involved. So I thought to myself, "How can I mess with this?"

Well, first things first, they have no concrete method there for creating these expressions. There are two pages total dedicated to the creation of these expressions, so I had to get creative to make it work.

They describe using numpy to solve the matrix equation created and using a hack-y method to circumvent not having a square matrix, but I thought that I could do a bit better...

Enter two painstaking days of learning linear algebra and figuring out exactly what I needed to do. They start by computing the truth tables of some expressions, putting them into a matrix as columns, then solving for the vector that, when using the dot product on the vector and the matrix, returned zero. After that, they filtered out various "rewrite rules" from the matrix generated. You can read more about this in the paper, though there's not much to go off of. They use numpy's linalg.solve to do this, but that only works with square matrices and produced results with constants that were a tad small for my taste :^)

After a bit of research I found a python module called cvxpy, designed to find values that satisfy an expression under certain constraints. Even cooler was that you could specify matrix equations and integer-only solutions, which is exactly what I needed. After tinkering with it for a bit, I was able to reliably create expressions like these (representing a xor b):

-27540 * (~a & b) + 373574 * (~a ^ ~b) + -27541 * (a & ~b) + -27541 * (~a & b) + -11 * (a + b) + -30436 * (~a & ~b) + -30436 * (~a * ~b) + 137712 * (a * ~b) + -27544 * (~a) + 1 * (b) + 3 * (~a + ~b) + -221347 * (~a - ~b) + 13 * (a + b) + -2 * (a) + -30454 * (~a + ~b) + -30454 * (~a + ~b) + -3 * (b) + -30449 * (a | b) + -27546 * (~b)
3672455 * (~a * b) + -362611 * (a ^ b) + 78113 * (a) + -524636 * (~b) + -524636 * (a ^ ~b) + 78113 * (a) + -524636 * (~a | b) + -362611 * (a ^ b) + -959545 * (a | b) + -78113 * (a - b) + -959545 * (~a + ~b) + -524636 * (~a) + 142249 * (a + b) + -959544 * (~a + ~b) + 142249 * (a + b) + -524637 * (a - ~b) + -524637 * (~a) + -524637 * (a & ~b) + 3241246 * (~a ^ ~b)

Using truth tables modulo 4 instead of modulo 2 I was also able to compute equivalencies for multiplication, which was pretty neato.

However, using the same method of computing the truth table and finding an equivalent expression you can reverse this sort of operation. I'll leave that as an exercise to the reader.  ;)

 

EDIT: As a friend of mine pointed out, this will work with any operation that can be reducible to boolean math (i.e. xor, addition, subtraction, multiplication), not just arithmetic operations.

DefCon42

DefCon42


grammar

Hey all!

I recently came across this neat paper here: https://tel.archives-ouvertes.fr/tel-01623849/document where they used what they called "Mixed-Boolean Arithmetic" to obfuscate arithmetic expressions, and then showed ways to deobfuscate them.

Looking a the deobfuscation methods, they seemed largely either pattern-based or wouldn't work when bigger numbers were involved. So I thought to myself, "How can I mess with this?"

Well, first things first, they have no concrete method there for creating these expressions. There are two pages total dedicated to the creation of these expressions, so I had to get creative to make it work.

They describe using numpy to solve the matrix equation created and using a hack-y method to circumvent not having a square matrix, but I thought that I could do a bit better...

Enter two painstaking days of learning linear algebra and figuring out exactly what I needed to do. They start by computing the truth tables of some expressions, putting them into a matrix as columns, then solving for the vector that, when using the dot product on the vector and the matrix, returned zero. After that, they filtered out various "rewrite rules" from the matrix generated. You can read more about this in the paper, though there's not much to go off of. They use numpy's linalg.solve to do this, but that only works with square matrices and produced results with constants that were a tad small for my taste :^)

After a bit of research I found a python module called cvxpy, designed to find values that satisfy an expression under certain constraints. Even cooler was that you could specify matrix equations and integer-only solutions, which is exactly what I needed. After tinkering with it for a bit, I was able to reliably create expressions like these (representing a xor b):

-27540 * (~a & b) + 373574 * (~a ^ ~b) + -27541 * (a & ~b) + -27541 * (~a & b) + -11 * (a + b) + -30436 * (~a & ~b) + -30436 * (~a * ~b) + 137712 * (a * ~b) + -27544 * (~a) + 1 * (b) + 3 * (~a + ~b) + -221347 * (~a - ~b) + 13 * (a + b) + -2 * (a) + -30454 * (~a + ~b) + -30454 * (~a + ~b) + -3 * (b) + -30449 * (a | b) + -27546 * (~b)
3672455 * (~a * b) + -362611 * (a ^ b) + 78113 * (a) + -524636 * (~b) + -524636 * (a ^ ~b) + 78113 * (a) + -524636 * (~a | b) + -362611 * (a ^ b) + -959545 * (a | b) + -78113 * (a - b) + -959545 * (~a + ~b) + -524636 * (~a) + 142249 * (a + b) + -959544 * (~a + ~b) + 142249 * (a + b) + -524637 * (a - ~b) + -524637 * (~a) + -524637 * (a & ~b) + 3241246 * (~a ^ ~b)

Using truth tables modulo 4 instead of modulo 2 I was also able to compute equivalencies for multiplication, which was pretty neato.

However, using the same method of computing the truth table and finding an equivalent expression you can reverse this sort of operation. I'll leave that as an exercise to the reader.  ;)

DefCon42

DefCon42

Hey all!

I recently came across this neat paper here: https://tel.archives-ouvertes.fr/tel-01623849/document where they used what they called "Mixed-Boolean Arithmetic" to obfuscate arithmetic expressions, and then showed ways to deobfuscate them.

Looking a the deobfuscation methods, they seemed largely either pattern-based or wouldn't work when bigger numbers were involved. So I thought to myself, "How can I mess with this?"

Well, first things first, they have no concrete method there for creating these expressions. There are two pages total dedicated to the creation of these expressions, so I had to get creative to make it work.

They describe using numpy to solve the matrix equation created and using a hack-y method to circumvent not having a square matrix, but I thought that I could do a bit better...

Enter two painstaking days of learning linear algebra and figuring out exactly what I needed to do. They start by computing the truth tables of some expressions, putting them into a matrix as columns, then solving for the vector that, when using the dot product on the vector and the matrix, returned zero. After that, they filtered out various "rewrite rules" from the matrix generated. You can read more about this in the paper, though there's not much to go off of. They use numpy's linalg.solve to do this, but that only works with square matrices and produced results with constants that were a tad small for my taste :^)

After a bit of research I found a python module called cvxpy, designed to find values that satisfy an expression under certain constraints. Even cooler was that you could specify matrix equations and integer-only solutions, which is exactly what I needed. After tinkering with it for a bit, I was able to reliably create expressions like these (representing a xor b):

-27540 * (~a & b) + 373574 * (~a ^ ~b) + -27541 * (a & ~b) + -27541 * (~a & b) + -11 * (a + b) + -30436 * (~a & ~b) + -30436 * (~a * ~b) + 137712 * (a * ~b) + -27544 * (~a) + 1 * (b) + 3 * (~a + ~b) + -221347 * (~a - ~b) + 13 * (a + b) + -2 * (a) + -30454 * (~a + ~b) + -30454 * (~a + ~b) + -3 * (b) + -30449 * (a | b) + -27546 * (~b)
3672455 * (~a * b) + -362611 * (a ^ b) + 78113 * (a) + -524636 * (~b) + -524636 * (a ^ ~b) + 78113 * (a) + -524636 * (~a | b) + -362611 * (a ^ b) + -959545 * (a | b) + -78113 * (a - b) + -959545 * (~a + ~b) + -524636 * (~a) + 142249 * (a + b) + -959544 * (~a + ~b) + 142249 * (a + b) + -524637 * (a - ~b) + -524637 * (~a) + -524637 * (a & ~b) + 3241246 * (~a ^ ~b)

Using truth tables modulo 4 instead of modulo 2 I was also able to compute equivalencies for multiplication, which was pretty neato.

Which is pretty neat. However, using the same method of computing the truth table and finding an equivalent expression you can reverse this sort of operation. I'll leave that as an exercise to the reader.  ;)

×
×
  • Create New...