bomblader Posted August 23, 2015 (edited)
Here you go... have fun! You can use it like this: name.pyc <password>dec.rar
Edited August 23, 2015 by bomblader
Extreme Coders Posted August 24, 2015 (edited)
The file is nothing special. It has been processed by PjOrion. It breaks existing disassemblers and decompilers because it introduces junk instructions into the code stream which are never executed. Python, unlike Java, does not have a bytecode verifier, so Python never complains unless those instructions are actually executed.
To handle this we need a recursive disassembler like IDA Pro. There is already a Python processor module (Ch 19), but it does not seem to work for version 2.7.
When I have time, I will come up with a tool which will automatically deobfuscate such files.
Here is a cmd output showing why decompilers/disassemblers fail. Note the junk instructions.

>>> import marshal, dis
>>> f = open('1.pyc', 'rb')
>>> f.seek(8)
>>> co = marshal.load(f)
>>> dis.disassemble(co)
  1     >>    0 SETUP_EXCEPT            99 (to 102)
              3 <144>                  387
              6 STOP_CODE
              7 JUMP_FORWARD           217 (to 227)
             10 <157>                44944
             13 LOAD_NAME            28929
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\dis.py", line 97, in disassemble
    print '(' + co.co_names[oparg] + ')',
IndexError: tuple index out of range

Edited August 24, 2015 by Extreme Coders
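To make the difference between a linear sweep and a recursive disassembler concrete, here is a minimal sketch. It is written for Python 3 (the thread itself targets 2.7), uses the Python 2.7 opcode numbers and instruction sizes, and runs on a small hand-made byte string, not on the real file. The names `decode`, `linear_sweep` and `reachable_offsets` are made up for this illustration.

```python
# Python 2.7 opcode numbers; instructions are 1 byte, or 3 bytes if they take an argument.
HAVE_ARGUMENT = 90
POP_TOP, RETURN_VALUE, LOAD_CONST = 1, 83, 100
JUMP_FORWARD, JUMP_ABSOLUTE, SETUP_EXCEPT = 110, 113, 121
KNOWN = {POP_TOP, RETURN_VALUE, LOAD_CONST, JUMP_FORWARD, JUMP_ABSOLUTE, SETUP_EXCEPT}


def decode(code, offset):
    """Return (opcode, arg, size); unknown opcodes are treated as 1-byte invalid instructions."""
    op = code[offset]
    if op not in KNOWN or op < HAVE_ARGUMENT:
        return op, None, 1
    return op, code[offset + 1] | (code[offset + 2] << 8), 3


def linear_sweep(code):
    """Decode from offset 0 to the end, one instruction after another (what dis does)."""
    offsets, off = [], 0
    while off < len(code):
        offsets.append(off)
        off += decode(code, off)[2]
    return offsets


def reachable_offsets(code, entry=0):
    """Decode only what control flow can reach, following jumps and handler targets."""
    seen, work = set(), [entry]
    while work:
        off = work.pop()
        if off in seen or off >= len(code):
            continue
        seen.add(off)
        op, arg, size = decode(code, off)
        nxt = off + size
        if op == RETURN_VALUE or op not in KNOWN:
            continue                          # return or invalid opcode: no fall-through
        if op == JUMP_ABSOLUTE:
            work.append(arg)                  # absolute target
        elif op == JUMP_FORWARD:
            work.append(nxt + arg)            # relative to the next instruction
        elif op == SETUP_EXCEPT:
            work.extend([nxt, nxt + arg])     # fall-through and exception handler
        else:
            work.append(nxt)
    return sorted(seen)


# Toy stream: SETUP_EXCEPT, invalid opcode, 3 junk bytes, POP_TOP,
# JUMP_FORWARD over 2 junk bytes, LOAD_CONST, RETURN_VALUE.
TOY = bytes([121, 4, 0, 144, 157, 0x90, 0xAF, 1, 110, 2, 0, 0xFF, 0xFF, 100, 0, 0, 83])

if __name__ == '__main__':
    print(linear_sweep(TOY))       # includes the junk offsets
    print(reachable_offsets(TOY))  # only offsets that can actually execute
```

The linear sweep walks straight into the junk bytes, while the worklist version never visits them because no jump or handler edge leads there.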
bomblader (Author) Posted September 4, 2015 (edited)
Anyone else?
Edited September 4, 2015 by bomblader
Extreme Coders Posted September 12, 2015
I have looked at the protection in a bit more detail. It isn't sophisticated, but it offers decent protection of a kind not seen before in Python (at least I have never seen it). It uses exception handling to redirect flow. In addition it introduces control flow obfuscation and variable name obfuscation. However, there aren't any opaque predicates.
For example, here is one of the code objects disassembled.

  0 SETUP_EXCEPT 99
  3 <INVALID>

102 POP_TOP
103 POP_TOP
104 POP_TOP
105 LOAD_CONST 1
108 JUMP_FORWARD 14

125 MAKE_FUNCTION 0
128 JUMP_ABSOLUTE 205

205 STORE_FAST 0
208 JUMP_ABSOLUTE 145

145 SETUP_FINALLY 6
148 JUMP_ABSOLUTE 160

160 JUMP_ABSOLUTE 55

 55 LOAD_CONST 2
 58 JUMP_FORWARD 16

 77 LOAD_CONST 0
 80 JUMP_ABSOLUTE 25

 25 IMPORT_NAME 0
 28 JUMP_FORWARD 34

 65 STORE_FAST 1
 68 JUMP_ABSOLUTE 33

 33 LOAD_GLOBAL 1
 36 JUMP_ABSOLUTE 88

 88 LOAD_FAST 1
 91 JUMP_ABSOLUTE 4

  4 CALL_FUNCTION 1
  7 JUMP_FORWARD 217

227 POP_TOP
228 JUMP_ABSOLUTE 165

165 LOAD_FAST 0
168 JUMP_ABSOLUTE 135

135 LOAD_FAST 0
138 JUMP_FORWARD 34

175 CALL_FUNCTION 1
178 JUMP_ABSOLUTE 14

 14 POP_TOP
 15 JUMP_ABSOLUTE 43

 43 POP_BLOCK
 44 JUMP_ABSOLUTE 213

213 LOAD_CONST 0
216 JUMP_FORWARD 20

239 DELETE_FAST 0
242 JUMP_ABSOLUTE 117

117 END_FINALLY
118 JUMP_ABSOLUTE 189

189 LOAD_CONST 0
192 JUMP_ABSOLUTE 238

238 RETURN_VALUE

The first instruction sets up an exception handler. The second instruction is always invalid. This generates an exception, and control is transferred to the handler at 102 (0 + 3 + 99). From there we need to disassemble instructions while tracking jumps and transferring control flow as needed. Note that JUMP_FORWARD is a relative jump (its target is the offset of the next instruction plus the argument) whereas JUMP_ABSOLUTE is an absolute jump.
To make the file decompilable, more effort is needed. From the disassembly, we need to build up a basic block representation of the code. Next, the list of basic blocks needs to be coalesced to remove the unnecessary jump instructions in between. Finally, this intermediate representation must be assembled to generate the deobfuscated code. All instruction offsets must be properly handled.
Removing the variable name obfuscation is easy. Local variable names live in the co_varnames member of the corresponding code object; just rename all the entries there and you are done.
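The jump arithmetic described above can be checked with a short sketch. It is written for Python 3, works on a hand-copied subset of the listing rather than on real bytecode, and assumes Python 2.7 sizes (1 byte without an argument, 3 bytes with one). `LISTING` and `follow` are names invented for this example, and the walk only follows the straight-line path, ignoring the handler edge of SETUP_FINALLY.

```python
# offset -> (mnemonic, argument), copied from the first part of the listing in the post
LISTING = {
    102: ('POP_TOP', None), 103: ('POP_TOP', None), 104: ('POP_TOP', None),
    105: ('LOAD_CONST', 1), 108: ('JUMP_FORWARD', 14),
    125: ('MAKE_FUNCTION', 0), 128: ('JUMP_ABSOLUTE', 205),
    205: ('STORE_FAST', 0), 208: ('JUMP_ABSOLUTE', 145),
    145: ('SETUP_FINALLY', 6), 148: ('JUMP_ABSOLUTE', 160),
    160: ('JUMP_ABSOLUTE', 55),
    55: ('LOAD_CONST', 2), 58: ('JUMP_FORWARD', 16),
    77: ('LOAD_CONST', 0), 80: ('JUMP_ABSOLUTE', 25),
    25: ('IMPORT_NAME', 0),
}


def follow(listing, start):
    """Walk the listing from start, skipping over unconditional jumps.

    Returns the non-jump instructions in execution order.
    """
    order, seen, off = [], set(), start
    while off in listing and off not in seen:
        seen.add(off)
        mnemonic, arg = listing[off]
        nxt = off + (1 if arg is None else 3)   # Python 2.7 instruction sizes
        if mnemonic == 'JUMP_ABSOLUTE':
            off = arg                            # target is the argument itself
        elif mnemonic == 'JUMP_FORWARD':
            off = nxt + arg                      # target is relative to the next instruction
        else:
            order.append((mnemonic, arg))
            if mnemonic == 'RETURN_VALUE':
                break
            off = nxt
    return order


if __name__ == '__main__':
    for mnemonic, arg in follow(LISTING, 102):
        print(mnemonic, '' if arg is None else arg)
```

For instance, the JUMP_FORWARD 14 at offset 108 lands on 108 + 3 + 14 = 125, which is exactly where MAKE_FUNCTION sits in the listing.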
bomblader (Author) Posted September 13, 2015
Extreme Coders, is there a way to "debug" the .pyc file and find out the exact order of execution without manually checking each opcode?
Extreme Coders Posted September 13, 2015 (edited)
At the moment, Python does not seem to have a bytecode debugger. One possible workaround is to debug the Python VM itself. The relevant code is in the function PyEval_EvalFrameEx in ceval.c; the huge switch is the interpreter loop.
Another, much easier way is to build Python from source with the LLTRACE flag defined in the preprocessor definitions. This way Python will display every instruction it executes.
A third way is to (ab)use the line number mapping feature of code objects. See here for an implementation.
Whichever of these methods you follow, all you get is an instruction trace for one particular execution. You would never achieve complete code coverage, which is necessary for proper deobfuscation.
Edited September 13, 2015 by Extreme Coders
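For readers on a newer interpreter: CPython 3.7 and later can deliver per-instruction trace events through `sys.settrace` when `f_trace_opcodes` is set on a frame. This does not exist in Python 2.7, which is what the thread is about, so the sketch below only illustrates what an instruction trace looks like; it is not one of the three methods above. `trace_opcodes` and `sample` are names invented here.

```python
import dis
import sys


def trace_opcodes(func, *args):
    """Run func(*args) and record (offset, mnemonic) for each instruction it executes."""
    trace = []
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None                      # ignore frames of other functions
        if event == 'call':
            frame.f_trace_opcodes = True     # request per-opcode events (CPython 3.7+)
        elif event == 'opcode':
            offset = frame.f_lasti           # offset of the instruction about to run
            trace.append((offset, dis.opname[target.co_code[offset]]))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)               # always restore the earlier tracer
    return result, trace


def sample(a, b):
    c = a + b
    return c


if __name__ == '__main__':
    value, executed = trace_opcodes(sample, 2, 3)
    for offset, name in executed:
        print(offset, name)
```

As with the methods in the post, this records only the path taken by one particular call, so it says nothing about branches that were not executed.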
Extreme Coders Posted September 14, 2015 (edited)
Using some graphviz magic, here is the control flow graph of one of the simpler obfuscated code objects.
[image]
In contrast, here is the CFG from a normal file.
[image]
And here is one of them splattered with jumps.
[image]
Note that the obfuscated code objects contain unnecessary jumps between the basic blocks. These jumps need to be removed, i.e. the basic blocks need to be coalesced. Next, this IR needs to be assembled back into a code object, after which it will hopefully be decompilable.
EDIT: Added another sample
Edited September 19, 2015 by Extreme Coders
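Coalescing can be pictured on a toy graph. The sketch below is Python 3 and uses plain dictionaries instead of real basic block objects; `coalesce`, `blocks` and `succs` are illustrative names, and a real pass would also have to leave exception handler blocks alone. A block is merged into its predecessor when it has exactly one predecessor and that predecessor has exactly one successor; the unconditional jump that connected them is dropped.

```python
def coalesce(blocks, succs, entry):
    """Merge chains of blocks linked only by unconditional jumps.

    blocks: name -> list of instruction strings
    succs:  name -> list of successor block names
    Returns new (blocks, succs) dictionaries; the inputs are not modified.
    """
    blocks = {name: list(ins) for name, ins in blocks.items()}
    succs = {name: list(targets) for name, targets in succs.items()}
    changed = True
    while changed:
        changed = False
        # Recompute predecessor lists after every merge.
        preds = {name: [] for name in blocks}
        for src, targets in succs.items():
            for dst in targets:
                preds[dst].append(src)
        for name in list(blocks):
            if name == entry or len(preds[name]) != 1:
                continue
            pred = preds[name][0]
            if pred == name or len(succs[pred]) != 1:
                continue
            # Drop the jump that only existed to link the two blocks.
            last = blocks[pred][-1] if blocks[pred] else ''
            if last.split(' ')[0] in ('JUMP_FORWARD', 'JUMP_ABSOLUTE'):
                blocks[pred].pop()
            blocks[pred].extend(blocks.pop(name))
            succs[pred] = succs.pop(name)
            changed = True
            break
    return blocks, succs


if __name__ == '__main__':
    toy_blocks = {
        'A': ['LOAD_CONST 1', 'JUMP_FORWARD'],
        'B': ['MAKE_FUNCTION 0', 'JUMP_ABSOLUTE'],
        'C': ['STORE_FAST 0', 'RETURN_VALUE'],
    }
    toy_succs = {'A': ['B'], 'B': ['C'], 'C': []}
    print(coalesce(toy_blocks, toy_succs, 'A'))
```

On the toy input the three blocks collapse into a single block containing the four real instructions, with both jumps gone.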
Extreme Coders Posted September 22, 2015
Just added the CFG simplification module, taking some ideas from the LLVM project: http://llvm.org/docs/doxygen/html/SimplifyCFGPass_8cpp_source.html
It can now deobfuscate this:
[image] --------> [image]
Now I need to develop an assembler which can turn these basic blocks back into instructions. Hopefully, LLVM will again be of help.
Extreme Coders Posted September 28, 2015
The assembler has been developed, but some more changes need to be made to make it workable (and readable). Currently, it optimises Python bytecode even better than the Python compiler; as a result decompilers (which usually rely on pattern matching) have problems. This needs to be fixed.
Here is the source.

import types
import opcode
import collections
import Queue
import marshal
import pydotplus
import cStringIO


class BasicBlock:
    def __init__(self):
        self.addr = 0
        self.predecessors = []
        self.successors = []
        self.instructions = []
        self.refHandlerIns = []
        self.isHandler = False
        self.isEntry = False
        self.b_seen = False  # b_seen is used to perform a DFS of basicblocks

    def addPredecessor(self, bb):
        self.predecessors.append(bb)

    def addSuccessor(self, bb):
        self.successors.append(bb)

    def addInstruction(self, ins):
        self.instructions.append(ins)

    def blockSize(self):
        return reduce(lambda size, ins: size + ins.size, self.instructions, 0)


class Instruction:
    def __init__(self, opkode, arg, size):
        self.opkode = opkode
        self.arg = arg
        self.size = size


class Disassembler:
    def __init__(self, code_object):
        assert isinstance(code_object, types.CodeType)
        self.c_stream = map(ord, code_object.co_code)

    def disasAt(self, offset):
        assert offset < len(self.c_stream)
        opkode = self.c_stream[offset]

        # Invalid instruction
        if opkode not in opcode.opmap.values():
            return Instruction(-1, None, 1)

        if opkode < opcode.HAVE_ARGUMENT:
            return Instruction(opkode, None, 1)

        if opkode >= opcode.HAVE_ARGUMENT:
            arg = (self.c_stream[offset + 2] << 8) | self.c_stream[offset + 1]
            return Instruction(opkode, arg, 3)


def isRetIns(ins):
    return ins.opkode == opcode.opmap['RETURN_VALUE']


def isBranchIns(ins):
    branchIns = [opcode.opmap[x] for x in [
        'JUMP_IF_FALSE_OR_POP',
        'JUMP_IF_TRUE_OR_POP',
        'JUMP_ABSOLUTE',
        'POP_JUMP_IF_FALSE',
        'POP_JUMP_IF_TRUE',
        'CONTINUE_LOOP',
        'FOR_ITER',
        'JUMP_FORWARD',
    ]]
    return ins.opkode in branchIns


def isCondiBranchIns(ins):
    condiBranchIns = [opcode.opmap[x] for x in [
        'JUMP_IF_FALSE_OR_POP',
        'JUMP_IF_TRUE_OR_POP',
        'POP_JUMP_IF_FALSE',
        'POP_JUMP_IF_TRUE',
        'FOR_ITER',
    ]]
    return ins.opkode in condiBranchIns


def isHandlerIns(ins):
    handlerIns = [opcode.opmap[x] for x in ['SETUP_LOOP', 'SETUP_EXCEPT', 'SETUP_FINALLY', 'SETUP_WITH']]
    return ins.opkode in handlerIns


def getInsCrossRef(ins, addr):
    targets = []
    if ins.opkode == opcode.opmap['JUMP_IF_FALSE_OR_POP']:
        targets.append(addr + ins.size)
        targets.append(ins.arg)

    elif ins.opkode == opcode.opmap['JUMP_IF_TRUE_OR_POP']:
        targets.append(addr + ins.size)
        targets.append(ins.arg)

    elif ins.opkode == opcode.opmap['JUMP_ABSOLUTE']:
        targets.append(ins.arg)

    elif ins.opkode == opcode.opmap['POP_JUMP_IF_FALSE']:
        targets.append(addr + ins.size)
        targets.append(ins.arg)

    elif ins.opkode == opcode.opmap['POP_JUMP_IF_TRUE']:
        targets.append(addr + ins.size)
        targets.append(ins.arg)

    elif ins.opkode == opcode.opmap['CONTINUE_LOOP']:
        targets.append(ins.arg)

    elif ins.opkode == opcode.opmap['FOR_ITER']:
        targets.append(addr + ins.size)
        targets.append(addr + ins.size + ins.arg)

    elif ins.opkode == opcode.opmap['JUMP_FORWARD']:
        targets.append(addr + ins.size + ins.arg)

    elif ins.opkode == opcode.opmap['SETUP_LOOP']:
        targets.append(addr + ins.size)
        targets.append(addr + ins.size + ins.arg)

    elif ins.opkode == opcode.opmap['SETUP_EXCEPT']:
        targets.append(addr + ins.size)
        targets.append(addr + ins.size + ins.arg)

    elif ins.opkode == opcode.opmap['SETUP_FINALLY']:
        targets.append(addr + ins.size)
        targets.append(addr + ins.size + ins.arg)

    elif ins.opkode == opcode.opmap['SETUP_WITH']:
        targets.append(addr + ins.size)
        targets.append(addr + ins.size + ins.arg)

    elif ins.opkode != opcode.opmap['RETURN_VALUE']:
        targets.append(addr + ins.size)

    return targets


def _leaderSortFunc(elem1, elem2):
    if elem1.addr != elem2.addr:
        return elem1.addr - elem2.addr
    else:
        if elem1.type == 'S':
            return -1
        else:
            return 1


def findLeaders(code_object, oep):
    Leader = collections.namedtuple('leader', ['type', 'addr'])
    leader_set = set()
    leader_set.add(Leader('S', oep))

    # Queue to contain list of addresses to be analyzed by linear sweep disassembly algorithm
    analysis_Q = Queue.Queue()
    analysis_Q.put(oep)
    analyzed_addresses = set()
    disassembler = Disassembler(code_object)

    while not analysis_Q.empty():
        addr = analysis_Q.get()
        while True:
            ins = disassembler.disasAt(addr)
            analyzed_addresses.add(addr)

            # If current instruction is a return, stop disassembling further
            # current address is an end leader
            if isRetIns(ins):
                leader_set.add(Leader('E', addr))
                break

            # If current instruction is branching, stop disassembling further
            # the current instr is an end leader, branch target is start leader
            if isBranchIns(ins):
                leader_set.add(Leader('E', addr))
                for target in getInsCrossRef(ins, addr):
                    leader_set.add(Leader('S', target))
                    if target not in analyzed_addresses:
                        analysis_Q.put(target)
                break

            # Current instruction is not branching
            else:
                # Get cross refs
                cross_refs = getInsCrossRef(ins, addr)
                addr = cross_refs[0]  # The immediate next instruction

                # Some non branching instructions like SETUP_LOOP,
                # SETUP_EXCEPT can have more than 1 cross references
                if len(cross_refs) == 2:
                    leader_set.add(Leader('S', cross_refs[1]))
                    if cross_refs[1] not in analyzed_addresses:
                        analysis_Q.put(cross_refs[1])

    return sorted(leader_set, cmp=_leaderSortFunc)


def buildBasicBlocks(leaders, code_object, entry_addr):
    i = 0
    bb_list = []
    disassembler = Disassembler(code_object)

    while i < len(leaders):
        leader1, leader2 = leaders[i], leaders[i+1]
        addr1, addr2 = leader1.addr, leader2.addr
        bb = BasicBlock()
        bb_list.append(bb)
        bb.addr = addr1
        offset = 0

        if addr1 == entry_addr:
            bb.isEntry = True

        if leader1.type == 'S' and leader2.type == 'E':
            while addr1 + offset <= addr2:
                ins = disassembler.disasAt(addr1 + offset)
                bb.addInstruction(ins)
                offset += ins.size
            i += 2

        elif leader1.type == 'S' and leader2.type == 'S':
            while addr1 + offset < addr2:
                ins = disassembler.disasAt(addr1 + offset)
                bb.addInstruction(ins)
                offset += ins.size
            i += 1

    return bb_list


def insMnemonic(ins):
    return opcode.opname[ins.opkode]


def findbbinBBList(bb_list, bb_addr):
    for i in range(len(bb_list)):
        if bb_list[i].addr == bb_addr:
            return i
    raise Exception("No basic block with an address {} exists!!".format(bb_addr))  # Should not happen


def buildPositionIndepedentBasicBlock(bb_list):
    for bb in bb_list:
        offset = 0
        for i in range(len(bb.instructions)):
            ins = bb.instructions[i]

            # The last ins of a bb is processed specially
            if i == len(bb.instructions) - 1:
                cross_ref = getInsCrossRef(ins, bb.addr + offset)

                if isBranchIns(ins):
                    # Conditional branch ins have 2 cross refs
                    if isCondiBranchIns(ins):
                        # ref1 is the address of next instruction
                        # ref2 is the address of the branch target
                        ref1, ref2 = cross_ref[0], cross_ref[1]
                        pos = findbbinBBList(bb_list, ref2)
                        ins.arg = bb_list[pos]
                        bb.addSuccessor(bb_list[pos])
                        bb_list[pos].addPredecessor(bb)
                        pos = findbbinBBList(bb_list, ref1)
                        bb.addSuccessor(bb_list[pos])
                        bb_list[pos].addPredecessor(bb)

                    # Unconditional branch ins have 1 cross ref
                    else:
                        ref = cross_ref[0]
                        pos = findbbinBBList(bb_list, ref)
                        ins.arg = bb_list[pos]
                        bb.addSuccessor(bb_list[pos])
                        bb_list[pos].addPredecessor(bb)

                # FOR_ITER, SETUP_LOOP, SETUP_EXCEPT, SETUP_FINALLY, SETUP_WITH
                # They have 2 cross refs
                elif isHandlerIns(ins):
                    # ref1 is the address of next instruction
                    # ref2 is the address of the handler
                    ref1, ref2 = cross_ref[0], cross_ref[1]
                    pos = findbbinBBList(bb_list, ref2)
                    bb_list[pos].isHandler = True
                    bb_list[pos].refHandlerIns.append(ins)
                    ins.arg = bb_list[pos]
                    pos = findbbinBBList(bb_list, ref1)
                    bb.addSuccessor(bb_list[pos])
                    bb_list[pos].addPredecessor(bb)

                # For RETURN_VALUE instruction, nothing to do
                elif isRetIns(ins):
                    pass

                # Normal instructions, have only 1 cross ref
                else:
                    ref = cross_ref[0]
                    pos = findbbinBBList(bb_list, ref)
                    bb.addSuccessor(bb_list[pos])
                    bb_list[pos].addPredecessor(bb)

            # Not the last instruction
            else:
                if isHandlerIns(ins):
                    ref = getInsCrossRef(ins, bb.addr + offset)[1]
                    pos = findbbinBBList(bb_list, ref)
                    bb_list[pos].isHandler = True
                    bb_list[pos].refHandlerIns.append(ins)
                    ins.arg = bb_list[pos]

            offset += ins.size


def findOEP(code_object):
    '''
    Finds the original entry point of a code object obfuscated by PjOrion.
    DO NOT call this for a non obfuscated code object.

    :param code_object: the code object
    :type code_object: code
    :returns: the entrypoint
    :rtype: int
    '''
    disassembler = Disassembler(code_object)
    ins = disassembler.disasAt(0)

    try:
        assert insMnemonic(ins) == 'SETUP_EXCEPT'
        except_handler = 0 + ins.arg + ins.size
        assert disassembler.disasAt(3).opkode == -1
        assert insMnemonic(disassembler.disasAt(except_handler)) == 'POP_TOP'
        assert insMnemonic(disassembler.disasAt(except_handler + 1)) == 'POP_TOP'
        assert insMnemonic(disassembler.disasAt(except_handler + 2)) == 'POP_TOP'
        return except_handler + 3
    except:
        return -1


def simplifyPass1(bb_list):
    """
    Eliminates a basic block that only contains an unconditional branch.
    """
    foo = True
    while foo:
        foo = False
        for i in range(len(bb_list)):
            bb = bb_list[i]
            if bb.isHandler and len(bb.instructions) == 1:
                ins = bb.instructions[0]
                if insMnemonic(ins) == 'JUMP_FORWARD' or insMnemonic(ins) == 'JUMP_ABSOLUTE':
                    branch_target_bb = bb.successors[0]  # Branch target of this basic block
                    branch_target_bb.predecessors.remove(bb)
                    branch_target_bb.isHandler = True

                    for refIns in bb.refHandlerIns:
                        refIns.arg = branch_target_bb

                    branch_target_bb.refHandlerIns = bb.refHandlerIns

                    # Now iterate over all predecessors of this bb
                    for j in range(len(bb.predecessors)):
                        # Remove this bb from the successor list
                        # Add branch target bb to the successor list
                        bb.predecessors[j].successors.remove(bb)
                        bb.predecessors[j].addSuccessor(branch_target_bb)
                        branch_target_bb.addPredecessor(bb.predecessors[j])
                        last_ins = bb.predecessors[j].instructions[-1]
                        if last_ins.opkode in opcode.hasjabs or last_ins.opkode in opcode.hasjrel:
                            last_ins.arg = branch_target_bb

                    del bb_list[i]
                    foo = True
                    break

            elif not bb.isHandler and len(bb.instructions) == 1:
                ins = bb.instructions[0]
                if insMnemonic(ins) == 'JUMP_FORWARD' or insMnemonic(ins) == 'JUMP_ABSOLUTE':
                    branch_target_bb = bb.successors[0]  # Branch target of this basic block
                    branch_target_bb.predecessors.remove(bb)

                    # Now iterate over all predecessors of this bb
                    for j in range(len(bb.predecessors)):
                        # Remove this bb from the successor list
                        # Add branch target bb to the successor list
                        bb.predecessors[j].successors.remove(bb)
                        bb.predecessors[j].addSuccessor(branch_target_bb)
                        branch_target_bb.addPredecessor(bb.predecessors[j])
                        last_ins = bb.predecessors[j].instructions[-1]
                        if last_ins.opkode in opcode.hasjabs or last_ins.opkode in opcode.hasjrel:
                            last_ins.arg = branch_target_bb

                    del bb_list[i]
                    foo = True
                    break


def simplifyPass2(bb_list):
    """
    Merges a basic block into its predecessor if there is only one and the
    predecessor only has one successor.
    """
    foo = True
    while foo:
        foo = False
        for i in range(len(bb_list)):
            bb = bb_list[i]
            # Not a handler block & has only 1 predecessor
            if not bb.isHandler and len(bb.predecessors) == 1:
                pred = bb.predecessors[0]
                # Predecessor has only 1 successor
                if len(pred.successors) == 1:
                    # Merge this bb with its predecessor
                    last_ins_pred = pred.instructions[-1]

                    # If last instruction of predecessor is either JUMP_ABSOLUTE or JUMP_FORWARD, delete it
                    if insMnemonic(last_ins_pred) == 'JUMP_ABSOLUTE' or insMnemonic(last_ins_pred) == 'JUMP_FORWARD':
                        del pred.instructions[-1]

                    # Append all instructions of current bb
                    for ins in bb.instructions:
                        pred.addInstruction(ins)

                    del pred.successors[:]

                    for succ in bb.successors:
                        pred.addSuccessor(succ)
                        succ.predecessors.remove(bb)
                        succ.addPredecessor(pred)

                    del bb_list[i]
                    foo = True
                    break


def bbToDot(bb):
    dot = '<<table align = "left" border = "0">'

    if bb.isEntry:
        dot += '<tr><td align = "left"><font point-size = "8" color = "#9dd600">entrypoint:</font></td></tr>'
    elif bb.isHandler:
        dot += '<tr><td align = "left"><font point-size = "8" color = "#9dd600">handler:</font></td></tr>'
    #else:
    #    dot += '<tr><td align = "left"><font point-size = "8" color = "#9dd600">off_{}:</font></td></tr>'.format(bb.addr)

    for ins in bb.instructions:
        dot += '<tr><td align = "left">{}</td></tr>'.format(insMnemonic(ins))

    dot += '</table>>'
    return pydotplus.Node('off_{}'.format(bb.addr), shape='none', style='filled', color='#2d2d2d',
                          label=dot, fontcolor='white', fontname='Consolas', fontsize='9')


def buildEdges(graph, nodelist, bb_list):
    for i in range(len(bb_list)):
        bb = bb_list[i]
        for succ in bb.successors:
            graph.add_edge(pydotplus.Edge(nodelist[i], nodelist[bb_list.index(succ)]))


def buildGraph(bb_list):
    graph = pydotplus.Dot(graph_type='digraph')
    # graph.set('splines', 'curved')
    nodelist = []

    for bb in bb_list:
        node = bbToDot(bb)
        graph.add_node(node)
        nodelist.append(node)

    buildEdges(graph, nodelist, bb_list)
    graph.write_svg('1_d.svg')


class Assembler:
    def __init__(self, bb_list):
        self.bb_list = bb_list
        self.a_postorder = [None] * len(bb_list)
        self.a_nblocks = 0

    def assemble(self):
        for bb in self.bb_list:
            if bb.isEntry:
                self._dfs(bb)
                break

        # Can't modify the bytecode after computing jump offsets.
        self._assembleJumpOffsets()
        return self._emit()

    def _assembleIns(self, ins):
        size = ins.size
        if ins.opkode >= opcode.HAVE_ARGUMENT:
            arg = ins.arg

        if size == 1:
            return chr(ins.opkode)
        elif size == 3:
            return chr(ins.opkode) + chr(arg & 0xFF) + chr((arg >> 8) & 0xFF)
        else:
            raise Exception('EXTENDED_ARG not yet implemented')

    def _emit(self):
        code = cStringIO.StringIO()
        for i in range(len(self.a_postorder) - 1, -1, -1):
            bb = self.a_postorder[i]
            for ins in bb.instructions:
                code.write(self._assembleIns(ins))
        return code.getvalue()

    def _dfs(self, bb):
        if bb.b_seen:
            return
        bb.b_seen = True

        if len(bb.successors) > 0:
            self._dfs(bb.successors[0])

        for i in range(len(bb.instructions)):
            ins = bb.instructions[i]
            if isinstance(ins.arg, BasicBlock):
                #if ins.opkode in opcode.hasjabs or ins.opkode in opcode.hasjrel:
                self._dfs(ins.arg)

        if len(bb.successors) == 2:
            self._dfs(bb.successors[1])

        self.a_postorder[self.a_nblocks] = bb
        self.a_nblocks += 1

    def _assembleJumpOffsets(self):
        totsize = 0
        # Iterate in reverse order and calculate the addresses of each bb
        for i in range(len(self.a_postorder) - 1, -1, -1):
            bsize = self.a_postorder[i].blockSize()
            self.a_postorder[i].addr = totsize
            totsize += bsize

        # We have calculated the offsets of each bb
        for bb in self.a_postorder:
            bsize = bb.addr
            for ins in bb.instructions:
                bsize += ins.size
                if ins.opkode in opcode.hasjabs:
                    ins.arg = ins.arg.addr
                elif ins.opkode in opcode.hasjrel:
                    ins.arg = ins.arg.addr - bsize


def deobfuscate(code_object):
    assert isinstance(code_object, types.CodeType)
    oep = findOEP(code_object)

    if oep == -1:
        print 'Not generating cfg for ', code_object.co_name
        return code_object.co_code

    leader_set = findLeaders(code_object, oep)
    bb_list = buildBasicBlocks(leader_set, code_object, oep)
    buildPositionIndepedentBasicBlock(bb_list)
    print 'Original number of basic blocks: ', len(bb_list)
    simplifyPass1(bb_list)
    print 'Number of basic blocks after pass 1: ', len(bb_list)
    simplifyPass2(bb_list)
    print 'Number of basic blocks after pass 2: ', len(bb_list)
    #buildGraph(bb_list)
    return Assembler(bb_list).assemble()


def recurseCodeObjects(code_obj):
    mod_const = []
    for const in code_obj.co_consts:
        if isinstance(const, types.CodeType):
            mod_const.append(recurseCodeObjects(const))
        else:
            mod_const.append(const)

    argcount = code_obj.co_argcount
    nlocals = code_obj.co_nlocals
    stacksize = code_obj.co_stacksize
    flags = code_obj.co_flags
    codestring = deobfuscate(code_obj)
    constants = tuple(mod_const)
    names = code_obj.co_names
    varnames = tuple('var{}'.format(i) for i in range(len(code_obj.co_varnames)))
    filename = code_obj.co_filename
    import random
    name = str(random.randint(100,999))  # 'renamed' # XXX: Use a better way
    firstlineno = code_obj.co_firstlineno
    lnotab = code_obj.co_lnotab

    return types.CodeType(argcount, nlocals, stacksize,
                          flags, codestring, constants, names,
                          varnames, filename, name, firstlineno, lnotab)


def main():
    fSrc = open('ob1.pyc', 'rb')
    fSrc.seek(8)
    c_obj = marshal.load(fSrc)
    fSrc.close()
    fOut = open('ob1_deobf.pyc', 'wb')
    fOut.write('\x03\xf3\x0d\x0a\0\0\0\0')
    marshal.dump(recurseCodeObjects(c_obj), fOut)
    fOut.close()


if __name__ == '__main__':
    main()
madskillz Posted November 6, 2015 (edited)
@ExtremeCoders Can we deobfuscate files such as this one? http://pastebin.com/jfpvdw2L
It is obfuscated with pyobfuscate.
Edited November 6, 2015 by madskillz
Extreme Coders Posted November 7, 2015
@madskillz: That is source code obfuscation, which is different from binary obfuscation. pyobfuscate operates on the source files whereas PjOrion operates on the binary level. To reverse the obfuscation of pyobfuscate you need to develop a Python source code parser. The ast, tokenize & parser modules would be a good starting point to learn about this.
For manual deobfuscation use a good Python IDE such as PyCharm. For instance, there are many redundant if statements whose condition is a constant and always false (30 - 30 is 0), so their bodies never run:

if 30 - 30: o0oOOo0O0Ooo - O0 % o0oOOo0O0Ooo - OoooooooOO * O0 * OoooooooOO
if 60 - 60: iIii1I11I1II1 / i1IIi * oO0o - I1ii11iIi11i + o0oOOo0O0Ooo
if 94 - 94: i1IIi % Oo0Ooo
if 68 - 68: Ii1I / O0

PyCharm can automatically refactor the code to remove such statements.
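As a rough illustration of the ast route, here is a minimal sketch (Python 3.8 or later, not the 2.7 used elsewhere in the thread) that drops `if` statements whose test is pure constant arithmetic. The names `DeadIfRemover` and `strip_dead_ifs` are invented for this example, and it covers only this one trick, not everything pyobfuscate does.

```python
import ast

# Node types allowed inside a test that we are willing to evaluate.
_ALLOWED = (ast.Constant, ast.BinOp, ast.UnaryOp, ast.operator, ast.unaryop)


def _constant_value(test):
    """Return (True, value) if test is constant arithmetic, else (False, None)."""
    if not all(isinstance(node, _ALLOWED) for node in ast.walk(test)):
        return False, None
    try:
        expr = ast.fix_missing_locations(ast.Expression(body=test))
        return True, eval(compile(expr, '<const>', 'eval'), {'__builtins__': {}})
    except Exception:                        # e.g. ZeroDivisionError in the test
        return False, None


class DeadIfRemover(ast.NodeTransformer):
    def visit_If(self, node):
        self.generic_visit(node)             # handle nested statements first
        known, value = _constant_value(node.test)
        if not known:
            return node
        live = node.body if value else node.orelse
        return live or None                  # splice in the live branch, or delete the if


def strip_dead_ifs(source):
    """Parse source and return an AST with constant-condition ifs resolved."""
    tree = DeadIfRemover().visit(ast.parse(source))
    # A block emptied by the removal would be invalid; give it a pass.
    for node in ast.walk(tree):
        body = getattr(node, 'body', None)
        if isinstance(body, list) and not body and not isinstance(node, ast.Module):
            body.append(ast.Pass())
    return ast.fix_missing_locations(tree)


if __name__ == '__main__':
    SAMPLE = (
        "x = 1\n"
        "if 30 - 30: o0oOOo0O0Ooo - O0 % o0oOOo0O0Ooo\n"
        "if 94 - 94: i1IIi % Oo0Ooo\n"
        "y = x + 1\n"
    )
    cleaned = strip_dead_ifs(SAMPLE)
    print(ast.dump(cleaned))
```

The cleaned tree can be compiled directly with `compile(tree, name, 'exec')`, or turned back into source with `ast.unparse` on Python 3.9 and later.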
bomblader (Author) Posted November 20, 2015 (edited)
How about the "protect .pyc file" option in Orion? It looks like it loads code that has been encoded multiple times and runs it in memory or something. Is there any way to "dump" the final loaded code from memory?
Also, is there any way to re-assemble disassembled Python code?
Edited November 20, 2015 by bomblader
Extreme Coders Posted November 20, 2015

12 minutes ago, bomblader said:
Looks like it loads multiple times encoded code and run it in memory or something. Is there any way to "dump" the final loaded code from memory?

There is nothing to dump as nothing is decrypted. All the protector does is control flow obfuscation. It takes the individual instructions, links them by unconditional jumps, and scatters them.

19 minutes ago, bomblader said:
Also, is there any way to re-assemble disassembled python code?

The deobfuscator already does that, but it needs more refinement to make it usable in all cases. I would eventually like to make this a full-fledged deobfuscator, but there are no promises as to when.
bomblader (Author) Posted November 20, 2015 (edited)

24 minutes ago, Extreme Coders said:
There is nothing to dump as nothing is decrypted. All the protector does is control flow obfuscation. It takes the individual instructions, links them by unconditional jumps, and scatters them.
The deobfuscator already does that, but it needs more refinement to make it usable in all cases. I would eventually like to make this a full-fledged deobfuscator, but there are no promises as to when.

I am talking about transforming the disassembled code into a .pyc file (not getting the original source code out of it).
The "protect" option makes the disassembled file look like this: http://pastebin.com/NuGjqJDK (check the code ending stuff at the bottom)
Edited November 20, 2015 by bomblader
Extreme Coders Posted May 11, 2016
The project has been open sourced on GitHub (https://github.com/extremecoders-re/PjOrion-Deobfuscator). Note that it is currently at a pre-alpha stage and will be improved with time.
You can read the related blog entry at https://0xec.blogspot.com/2016/05/pjorion-deobfuscator-open-sourced.html