Obfuscation & Deobfuscation
48 files
- 
	
	
	Analysis of malware binaries is constantly becoming more difficult with introduction of many different types of code obfuscators. One common theme in all obfuscators is transformation of code into a complex representation. This process can be viewed as inverse of compiler optimization techniques and as such can be partially removed using optimization algorithms. This paper presents common obfuscation techniques and a process of adapting optimization algorithms for removing obfuscations. Additionally, a plug-in for the IDA Pro disassembler is presented that demonstrates usability of the proposed optimization process as well as a set of techniques to speed up the process of analyzing obfuscated code.
 - 154 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Nearly every malware sample is sheathed in an executable protection which must be removed before static analyses can proceed. Existing research has studied automatically unpacking certain protections, but has not yet caught up with many modern techniques. Contrary to prior assumptions, protected programs do not always have the property that they are reverted to a fully unprotected state at some point during the course of their execution. This work provides a novel technique for circumventing one of the most problematic features of modern software protections, so-called virtualization obfuscation. The technique enables analysis of heretofore impenetrable malware.
 - 315 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Program obfuscation is an important software protection technique that prevents attackers from revealing the programming logic and design of the software. We introduce translingual obfuscation, a new software obfuscation scheme which makes programs obscure by “misusing” the unique features of certain programming languages. Translingual obfuscation translates part of a program from its original language to another language which has a different programming paradigm and execution model, thus increasing program complexity and impeding reverse engineering. In this paper, we investigate the feasibility and effectiveness of translingual obfuscation with Prolog, a logic programming language. We implement translingual obfuscation in a tool called BABEL, which can selectively translate C functions into Prolog predicates. By leveraging two important features of the Prolog language, i.e., unification and backtracking, BABEL obfuscates both the data layout and control flow of C programs, making them much more difficult to reverse engineer. Our experiments show that BABEL provides effective and stealthy software obfuscation, while the cost is only modest compared to one of the most popular commercial obfuscators on the market. With BABEL, we verified the feasibility of translingual obfuscation, which we consider to be a promising new direction for software obfuscation.
 - 109 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	This paper extends the idea of specializing modified interpreters for systematically generating obfuscated code. By using the Coq proof assistant we specify some elementary obfuscations and prove that the resulting distorted interpreter is correct, namely it preserves the intended semantics of programs. The paper shows how the semantic preservation proofs generated and verified in Coq can provide a measure of the quality of the obfuscation. In particular we can observe that there is a precise corresponding between the potency of the obfuscation and the complexity of the proof of semantics preservation. Our obfuscation can be easily integrated into the CompCert C compiler, providing the basis for a formally verified obfuscating compiler which can be applied to any C program.
 - 89 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	We show that indistinguishability obfuscation implies that all functions with sufficient "pseudo-entropy" cannot be obfuscated under a virtual black box definition with a universal simulator. Let F = {fs} be a circuit family with super-polynomial pseudo-entropy, and suppose O is a candidate obfuscator with universal simulator S. We demonstrate the existence of an adversary A that, given the obfuscation O(fs), learns a predicate the simulator S cannot learn from the code of A and black-box access to fs. Furthermore, this is true in a strong sense: for any secret predicate P that is not learnable from black-box access to fs, there exists an adversary that given O(fs) efficiently recovers P (s), whereas given oracle access to fs and given the code of the adversary, it is computationally hard to recover P (s).
 
 We obtain this result by exploiting a connection between obfuscation with a universal simulator and obfuscation with auxiliary inputs, and by showing new impossibility results for obfuscation with auxiliary inputs.
 - 106 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Source code obfuscation is a protection mechanism widely used to limit the possibility of malicious reverse engineering or attack activities on a software system. Although several code obfuscation techniques and tools are available, little knowledge is available about the capability of obfuscation to reduce attackers efficiency, and the contexts in which such an efficiency may vary.
 
 This paper reports the outcome of two controlled experiments meant to measure the ability of subjects to understand and modify decompiled, obfuscated Java code, compared to decompiled, clear code.
 
 Results quantify to what extent code obfuscation is able to make attacks more difficult to be performed, and reveal that obfuscation can mitigate the effect of factors that can alter the likelihood of a successful attack, such as the attackers skill and experience, or the intrinsic characteristics of the system under attack.
 - 111 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Symbolic and concolic execution find important applications in a number of security-related program analyses, including analysis of malicious code. However, malicious code tend to very often be obfuscated, and current concolic analysis techniques have trouble dealing with some of these obfuscations, leading to imprecision and/or excessive resource usage. This paper discusses three such obfuscations: two of these are already found in obfuscation tools used by malware, while the third is a simple variation on an existing obfuscation technique. We show empirically that existing symbolic analyses are not robust against such obfuscations, and propose ways in which the problems can be mitigated using a combination of fine-grained bit-level taint analysis and architecture-aware constraint generations. Experimental results indicate that our approach is effective in allowing symbolic and concolic execution to handle such obfuscations.
 - 154 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	In recent years code obfuscation has attracted research interest as a promising technique for protecting secret properties of programs. The basic idea of code obfuscation is to transform programs in order to hide their sensitive information while preserving their functionality. One of the major drawbacks of code obfuscation is the lack of a rigorous theoretical framework that makes it difficult to formally analyze and certify the effectiveness of obfuscating techniques. We face this problem by providing a formal framework for code obfuscation based on abstract interpretation and program semantics. In particular, we show that what is hidden and what is preserved by an obfuscating transformation can be expressed as abstract interpretations of program semantics. Being able to specify what is masked and what is preserved by an obfuscation allows us to understand its potency, namely the amount of obscurity that the transformation adds to programs. In the proposed framework, obfuscation and attackers are modeled as approximations of program semantics and the lattice of abstract interpretations provides a formal tool for comparing obfuscations with respect to their potency. In particular, we prove that our framework provides an adequate setting to measure not only the potency of an obfuscation but also its resilience, i.e., the difficulty of undoing the obfuscation. We consider code obfuscation by opaque predicate insertion and we show how the degree of abstraction needed to disclose different opaque predicates allows us to compare their potency and resilience.
 - 178 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Some time in the recent past, I stumbled upon a news on The Intercept, about a malware being used against some Argentine prosecutor, who was found dead under uncanny circumstances (Fig. 1 & 2). Interested, I decided to have a look at the malware. On further analysis, it revealed that it was a java based malware. Since java reversing is somewhat less documented, I decided to shed some light in this field, and the result of that is this document.
 - 131 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Control code obfuscation is intended to prevent malicious reverse engineering of software by masking the program control flow. The idea for further advancing the state of the art was presented in 2000 by WANG C. An obfuscating system for Java based on the ideas of WANG C is implemented and experimented. The experiment results show that obfuscation can be done efficiently with moderate increases in code size, execution times, while making the obfuscated code resilient to a variety of reverse engineering attacks.
 - 103 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Obfuscation of machine code programs is a form of protection of programs' code against unauthorized reading. The problem of obfuscation is quite fresh, because first papers connected directly to obfuscation appeared only few years ago, yet some advanced publications can be already found. We reviewed them, describing in details most important papers of Christian Collberg and Chenxi Wang.
 
 We proposed a formal model of program based on the analysis of changes in the usage of computer's resources utilized by the program. The model appeared to be useful for development of obfuscation methods working on the low level of programming. We showed that obfuscating transformation has some interesting properties and proved, that for machine programs it is possible to create a single-pass algorithm of obfuscation. Describing own classification of obfuscating transformations we described different methods of obfuscation from the low level point of view. Obtaining results of research on typical properties of structure of today's computers' programs we created an efficient method of redundant code generation, required during the process of obfuscation. On the base of theoretical analysis and experience of another scientists we proposed a basic algorithm of machine programs obfuscation, which was implemented for the RISC and CISC type processors.
 
 To estimate efficiency of the obfuscation process we proposed three analytical methods of quality measurements and results of empirical research. We created three algorithms of machine programs' complexity measurement. For the implementation we showed results of quality measurements, performed using analytical and empirical methods. The empirical measurements were done on three different groups of programmers. From the final results it can be concluded, what should be the form of an algorithm of obfuscation, giving almost one hundred percent safe protection against unauthorized analysis. In the final conclusions we estimated values of parameters of an obfuscation process, giving such good efficiency.
 - 145 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	In this technical report, we discuss the use of code obfuscation as means of protecting the intellectual property of software. An obfuscation is a behaviour preserving program transformation which aims to make a program harder to understand (which can mean it becomes unintelligible to automated program comprehension tools or that the result of program analyses become less useful to a human adversary). This report consists of three main parts. The first part of the report discusses some of the possible definitions of obfuscation and gives an extensive survey of some of the current obfuscation techniques. The second part considers a report written by University of Applied Sciences of Upper Austria, Hagenberg which discuss a variety of different protection techniques (including obfuscation). The last part reviews the techniques that we have seen so far. We provide an analysis of some of the reasons why current obfuscators are generally weak and why some of the better obfuscation techniques have not been implemented. Finally, we give recommendations for how better obfuscators can be created using obfuscation techniques which have not yet been implemented.
 - 97 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Array restructuring operations obscure arrays. Our work aims on java source code obfuscation containing arrays. Our main proposal is Classes with restructured array members and obscured member methods for setting, getting array elements and to get the length of arrays. The class method definition codes are obscured through index transformation and constant hiding. The instantiated objects of these classes are used for source code writing. A tool named JDATATRANS is developed for generating classes and to the best of our knowledge this is the first tool available for array restructuring, on Java source codes.
 - 97 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	This paper presents LOCO, a graphical, interactive environment to experiment with code obfuscation and deobfuscation transformations, which can be applied automatically, semi-automatically and by hand. LOCO is an extension of the multi-platform visualization tool LANCET, combined with an obfuscation infrastructure in the underlying link-time program rewriter DIABLO. By use of LOCO, a developer can easily navigate through the control flow graph of a program and do fine-grained obfuscation, test new obfuscation transformations, test the robustness of existing transformations or improve existing transformations.
 - 115 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Binary obfuscation plays an essential role in evading malware static analysis and detection. The widely used code obfuscation techniques, such as polymorphism and metamorphism, focus on evading syntax based detection. However, statistic test and semantic analysis techniques have been developed to thwart their evasion attempts. More recent binary obfuscation techniques are divided in their purposes of attacking either statistical or semantic approach, but not both. In this paper, we introduce mimimorphism, a novel binary obfuscation technique with the potential of evading both statistical and semantic detections. Mimimorphic malware uses instruction-syntax-aware high-order mimic functions to transform its binary into mimicry executables that exhibit high similarity to benign programs in terms of statistical properties and semantic characteristics. We implement a prototype of the mimimorphic engine on the Intel x86 platform, and evaluate its capability of evading statistical anomaly detection and semantic analysis detection techniques. Our experimental results demonstrate that the mimicry executables are indistinguishable from benign programs in terms of byte frequency distribution and entropy, as well as control flow fingerprint.
 - 112 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	A software obfuscator transforms a program into another executable one with the same functionality but unreadable code implementation. This paper presents an algorithm of multi-stage software obfuscation method using improved virtual machine techniques. The key idea is to iteratively obfuscate a program for many times in using different interpretations. An improved virtual machine (VM) core is appended to the protected program for byte-code interpretation. Adversaries will need to crack all intermediate results in order to figure out the structure of original code. Compared with existing obfuscators, our new obfuscator generates the protected code which performs more efficiently, and enjoys proven higher level security.
 - 109 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	In this paper we show how abstract interpretation, and more specifically completeness, provides an adequate model for reasoning about code obfuscation and watermarking. The idea is that making a program obscure, or equivalently hiding information in it, corresponds to force an interpreter (the attacker) to become incomplete in its attempts to extract information about the program. Here abstract interpretation provides the model of the attacker (malicious host) and abstract interpretation transformers provide driving methods for understanding and designing new obfuscation and watermarking strategies: Obfuscation corresponds to make the malicious host incomplete and watermarking corresponds to hide secrets where incomplete attackers cannot extract them unless some secret key is given.
 - 92 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	The standard idea of code aesthetics, when such an idea manifests itself at all, allows for programmers to have elegance and clarity as their standards. This paper explores programming practices in which other values are at work, showing that the aesthetics of code must be enlarged to accommodate them. The two practices considered are obfuscated programming and the creation of "weird languages" for coding. Connections between these two practices, and between these and other mechanical and literary aesthetic traditions, are discussed.
 - 85 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	This paper presents a tool BE-PUM (Binary Emulator for PUshdown Model generation), which generates a precise control flow graph (CFG), under presence of typical obfuscation techniques of malware, e.g., indirect jump, self-modification, overlapping instructions, and structured exception handler (SEH), which cover packers. Experiments are performed on 2000 real-world malware examples taken from VX Heaven and compare the results of a popular commercial disassembler IDA Pro, a state-of-the-art tool JakStab, and BE-PUM. It shows that BE-PUM correctly traces CFGs, whereas IDA Pro and JakStab fail. By manual inspection on 300 malware examples, we also observe that the starts of these failures exactly locate the entries of obfuscation code.
 - 93 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	The purpose of this paper is to introduce a further measurement for software obfuscation, in particular observing that many important obfuscation transformations increase the uncertainty an attacker has about the program behavior, uncertainty modeled by the entropy of the program traces or the nodes under execution. The transformations considered in this paper are unknown opaque predicates insertions or unknown dispatcher insertions, where the latter are an extension of the if-else statements of unknown opaque predicates to switch-case statements. Consequences of modeling obfuscation as an increase of entropy can be simple guidelines to obtain potent transformations at low cost and the explanation of existing transformations effectiveness. We present a program transformation algorithm based on the latter observations.
 - 101 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Barak et al. gave a first formalization of obfuscation, describing an obfuscator O as an efficient, probabilistic "compiler" that takes in input a program P and produces a new program O (P ) that has the same functionality as P but is unintelligible. This mean that any result an obfuscated program can compute is actually computable given only an input/output access (called oracle access) to the program P: we call such results trivial results. On the basis of this informal definition, they suggest a formal definition of obfuscation based on oracle access to programs and show that no obfuscator can exist according to this definition. They also try to relax the definition and show that, even with a restriction to some common classes of programs, there exists no obfuscator.
 
 In this work, we show that their definition is inaccurate and lacks a fundamental property, that we formalize by the notion of oracle programs. Oracle programs are an abstract notion which basically refers to perfectly obfuscated programs. We suggest a new definition of obfuscation based on these oracle programs and show that such obfuscators do not exist either. Considering the actual implementations of "obfuscators", we define a new kind of obfuscators, t-obfuscators. These are obfuscators that hide non trivial results at least for time t. By restricting the t-requirement to deobfuscation (that is outputting an intelligible program when fed with an obfuscated program in input), we show that such obfuscators do exist. Practical t-obfuscation methods are presented at the end of this paper: we focus more specifically on code protection techniques in a malware context.
 
 Based on the fact that a malware may fulfill its action in an amount of time which may be far larger than the analysis time of any automated detection program, these obfuscation methods can be considered as efficient enough to greatly thwart automated analysis and put check on any antivirus software.
 - 103 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	The goal of program obfuscation is to provably complicate the task of reverse engineering.
 
 Owners of source code wish to make it as hard as possible for others to extract information from their binaries - information which may reveal trade secrets, allow undesirable modifications, or enable software piracy.
 
 Obfuscation is typically seen as a problem relating to programming languages or compilers. The connection to cryptography is not entirely clear and was only first noticed in 2000. Since then, a few papers have been published every year on the subject.
 - 101 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	In recent years, code obfuscation has attracted attention as a low cost approach to improving software security by making it difficult for attackers to understand the inner workings of proprietary software systems. This paper examines techniques for automatic deobfuscation of obfuscated programs, as a step towards reverse engineering such programs. Our results indicate that much of the effects of code obfuscation, designed to increase the difficulty of static analyses, can be defeated using simple combinations of straightforward static and dynamic analyses. Our results have applications to both software engineering and software security. In the context of software engineering, we show how dynamic analyses can be used to enhance reverse engineering, even for code that has been designed to be difficult to reverse engineer. For software security, our results serve as an attack model for code obfuscators, and can help with the development of obfuscation techniques that are more resilient to straightforward reverse engineering.
 - 178 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	Microsoft's .NET Framework, and JAVA platforms, are based in a just-in-time compilation philosophy. Software developed using these technologies is executed in a hardware independent framework, which provides a full object-oriented environment, and in some cases allows the interaction of several components written in different programming languages. This flexibility is achieved by compiling into an intermediate code which is platform independent. Java is compiled into ByteCode, and Microsoft .NET programs are compiled into MSIL (Microsoft Intermediate Code). However, this flexibility comes with a price. With freeware tools available in Internet, it is quite easy to decompile intermediate codes and obtain a working, readable version of the source code. Obfuscation is the most accepted and commercially available technique that developers can use to protect their intellectual property In this work, we propose the use of try-catch mechanisms available in .NET as a way to improve the quality of one of the building blocks of obfuscation: opaque predicates.
 - 81 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
- 
	
	
	When new malware are discovered, it is important for researchers to analyze and understand them as quickly as possible. This task has been made more difficult in recent years as researchers have seen an increasing use of virtualization-obfuscated malware code. These programs are difficult to comprehend and reverse engineer, since they are resistant to both static and dynamic analysis tech-techniques. Current approaches to dealing with such code first reverse-engineer the byte code interpreter, then use this to work out the logic of the byte code program. This outside-in approach produces good results when the structure of the interpreter is known, but cannot be applied to all cases. This paper proposes a different approach to the problem that focuses on identifying instructions that affect the observable behaviour of the obfuscated code. This inside-out approach requires fewer assumptions, and aims to complement existing techniques by broadening the domain of obfuscated programs eligible for automated analysis. Results from a prototype tool on real-world malicious code are encouraging.
 - 128 Downloads
  Teddy RogersSubmitted Teddy RogersSubmitted
Download Statistics
- 2,147 Files
- 323 Comments
- 894 Reviews
- 
				
