Tuts 4 You

Obfuscation & Deobfuscation

48 files

  1. A Taxonomy of Obfuscating Transformations

    It has become more and more common to distribute software in forms that retain most or all of the information present in the original source code. An important example is Java bytecode. Since such codes are easy to decompile, they increase the risk of malicious reverse engineering attacks.

    In this paper we review several techniques for technical protection of software secrets. We will argue that automatic code obfuscation is currently the most viable method for preventing reverse engineering. We then describe the design of a code obfuscator, a tool which converts a program into an equivalent one that is more difficult to understand and reverse engineer.

    The obfuscator is based on the application of code transformations, in many cases similar to those used by compiler optimizers. We describe a large number of such transformations, classify them, and evaluate them with respect to their potency (To what degree is a human reader confused?), resilience (How well are automatic deobfuscation attacks resisted?), and cost (How much overhead is added to the application?).

    We finally discuss some possible deobfuscation techniques (such as program slicing) and possible counter-measures an obfuscator could employ against them.

    18 downloads

    0 comments

    Submitted

  2. A Toolkit for Code Obfuscation

    According to Business Software Alliance statistics, four out of every ten software programs are pirated worldwide. The global piracy rate has increased 40% over the past years and nearly $11 billion is lost. This is a clear threat to software producers and thus to the global economy. Over the years, several software protection techniques have been developed; code obfuscation is one of them, and it is very promising. Code obfuscation is a form of software protection against unauthorized reverse engineering. In this paper we discuss software protection techniques in general and provide a broad overview of known obfuscation algorithms. We also address the issues related to the implementation of obfuscation algorithms. Finally we propose JHide, an obfuscation toolkit for the protection of Java code. We conclude our paper by identifying the need to review the performance of the algorithms as the future scope of our work.

    20 downloads

    0 comments

    Submitted

  3. Application Security Through Program Obfuscation

    Business models behind products such as iTunes and the Skype VoIP clients depend entirely on the secrecy of technical details of their product. Once the technical details are uncovered, a medium such as the Internet is extremely powerful to (anonymously) spread the sensitive information and it is shown that stopping the spread of such highly sensitive information is difficult. Therefore, program obfuscation recently attracted a lot of attention as a low cost approach to protect the inner workings of an application. However, when a new obfuscating transformation is proposed, it is unclear how to measure the quality of such transformation as there is no general agreement on this matter in this young domain.

    Collberg's taxonomy describes the quality of an obfuscating transformation in terms of cost, resilience and potency. The cost describes the execution penalty, the resilience measures how well a transformation withstands an attack while the potency measures how much more difficult the obfuscated code is to understand.

    Our work contributes by describing attacks that test the resilience of an obfuscating transformation and by the construction of a framework based on software complexity metrics to evaluate the potency of obfuscating transformations. In this dissertation, we bring together existing control flow obfuscating transformations and existing software complexity metrics. In particular, we consider three transformations: control flow flattening (CFF), branch procedures and opaque predicates, together with two metrics: cyclomatic number and knot count. After applying the obfuscating transformations to a program, the complexity of the program increases. To measure this, our framework has to be capable of quantifying the obfuscating transformation independently of the point in the development process at which the obfuscating transformation is applied. Therefore, our framework works on the machine code. The machine code is represented by means of a static control flow graph. However, in this dissertation we also propose, for the first time, to construct a representation that is based on dynamically generated information.
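    As a rough illustration of one of the metrics named above, the following Python sketch computes the cyclomatic number V(G) = E - N + 2P of a control flow graph. The graph encoding (a dict of successor lists) and both example graphs are assumptions made for this illustration; they are not taken from the dissertation, whose framework works on machine code.

```python
from collections import deque


def cyclomatic_number(cfg):
    """Return V(G) = E - N + 2P for a CFG given as {node: [successors]}."""
    nodes = set(cfg)
    for succs in cfg.values():
        nodes.update(succs)
    edges = sum(len(succs) for succs in cfg.values())

    # Count connected components (P), treating edges as undirected.
    undirected = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            undirected[src].add(dst)
            undirected[dst].add(src)
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in undirected[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return edges - len(nodes) + 2 * components


# Hypothetical straight-line program: entry -> a -> b -> exit.
original = {"entry": ["a"], "a": ["b"], "b": ["exit"], "exit": []}

# Same program after a hypothetical opaque predicate is inserted at "a",
# adding a branch to a bogus block that rejoins the real path.
obfuscated = {
    "entry": ["a"],
    "a": ["b", "bogus"],
    "bogus": ["b"],
    "b": ["exit"],
    "exit": [],
}

print(cyclomatic_number(original), cyclomatic_number(obfuscated))
```

    In this toy case the inserted branch raises the cyclomatic number from 1 to 2, which is the kind of increase such a metric is meant to capture.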

    After we have shown that the proposed obfuscating transformations increase the complexity of the programs, we look for inverse transformations. A successful inverse transformation should bring the complexity of the program close to the original complexity, but not necessarily reconstruct the original program. The first de-obfuscating transformation is called cloning. By judiciously duplicating portions of the program, spurious execution paths no longer taint the original program execution paths. By studying the duplicated portions, it is possible to construct inverse transformations breaking CFF and the insertion of branch procedures.

    The second technique proposed to revert control flow obfuscating transformations is based on static feasible path analysis. Control flow obfuscating transformations aim to insert paths which are infeasible and the goal of our analysis is to detect these infeasible paths. First, a constraint is constructed for a given path through the program. Then, the analysis will determine whether that constraint is feasible or infeasible. This way, we are able to revert the CFF transformation. We also used the transformation to detect inserted opaque predicates – which are boolean valued expressions whose values are known to the obfuscator but difficult to determine afterwards – and by use of abstract interpretation we even reduced the number of elements in the domain. However, we discovered that this technique to detect opaque predicates is limited.

    The aforementioned inverse techniques assume the presence of a conservative representation of the program. However, it is not always the case that such a representation can easily be derived from a program. To overcome this problem we introduced a third technique based on static and dynamic analyses. The new technique is successful even without the presence of a full representation of the program. It suffices to statically identify program parts and to observe their behaviour during execution. By using this technique, we were able to break CFF and branch procedures without the construction of a control flow graph. Because of the lack of general applicability of the attack based on static feasible path analysis for opaque predicates, we propose a last de-obfuscation technique. This last technique targets opaque predicates of all kinds where the goal is to find the conditional branches controlled by an opaque predicate. Similar to the previous technique, this attack involves both static and dynamic components. First, by executing the program it is possible to identify a set of candidates. Then, this set of opaque predicates is narrowed by injecting both well-chosen and randomly generated numbers in the code and observing the particular conditional branch. This technique is able to identify 99% of the inserted opaque predicates.
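    The narrowing step described above (inject well-chosen and random values, then observe the branch) can be sketched in Python as follows. This is only a simplified model under stated assumptions: predicates are plain Python callables over one integer, and the names `looks_opaque` and `candidates` are hypothetical. The dissertation's attack operates on conditional branches in a running binary.

```python
import random


def looks_opaque(predicate, trials=1000, seed=0):
    """Return True if the predicate gives the same outcome for every injected value."""
    rng = random.Random(seed)
    # "Well-chosen" values: small integers plus 32-bit signed extremes.
    well_chosen = list(range(-4, 5)) + [2**31 - 1, -(2**31)]
    randoms = [rng.randint(-(2**31), 2**31 - 1) for _ in range(trials)]
    outcomes = {bool(predicate(x)) for x in well_chosen + randoms}
    return len(outcomes) == 1


# Hypothetical candidate branches identified in a first, dynamic pass.
candidates = {
    "x*(x+1) % 2 == 0": lambda x: (x * (x + 1)) % 2 == 0,  # always true
    "x*x >= 0": lambda x: x * x >= 0,                      # always true
    "x % 7 == 3": lambda x: x % 7 == 3,                    # ordinary branch
}

flagged = {name for name, pred in candidates.items() if looks_opaque(pred)}
print(sorted(flagged))
```

    A regular predicate that happens to give a constant outcome on every tested value would be flagged wrongly, so a test of this kind can overestimate the number of opaque predicates.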

    The complexity metrics of the programs after de-obfuscation indicate that the complexity is only marginally larger than the complexity of the original programs. For opaque predicates, we go into much more detail because we can distinguish two types of errors. The underestimation error denotes the number of opaque predicates that were not found, and the overestimation error denotes the number of regular conditional branches wrongly indicated as governed by an opaque predicate.

    We have demonstrated that existing obfuscating transformations increase the complexity of a program. At the same time, these obfuscating transformations are vulnerable to attacks, reducing the complexity close to the complexity of the original program. Therefore, we propose an obfuscating transformation based on self-modifying code that is more resilient against attacks. We provide solutions to protect against an insider attack. The technique is suitable for application security because it breaks assumptions made by state-of-the-art tools. Based on self-modifying code, we introduce a technique called trace obfuscation. Trace obfuscation forces the attacker to perform program understanding on the trace instead of on a graph. One proposed trace obfuscation transformation combines dynamic software mutation, cloning, diversity and obfuscating predicates.

    There are three major contributions in this dissertation. First of all, a framework to measure program complexity is proposed and is applied to existing control flow obfuscating transformations. Secondly, we introduced inverse transformations attacking the control flow obfuscating transformations. Lastly, we introduced a new obfuscating transformation called trace obfuscation.

    16 downloads

    0 comments

    Submitted

  4. Applied Binary Code Obfuscation

    Obfuscated code is code that is hard (but not impossible) to read and understand. Sometimes corporate developers, programmers and malware coders intentionally obfuscate their software for security reasons, in an attempt to delay reverse engineering or to keep antivirus engines from identifying malicious behaviours. Nowadays, obfuscation is often applied to object-oriented cross-platform programming languages like Java, .NET (C#, VB), Perl, Ruby, Python and PHP. That is because their code can be easily decompiled and examined, making them vulnerable to reverse engineering. On the other hand, obfuscating binary code is not as easy as encrypting object or function names, as is done in the programming languages mentioned above. In this case, the code is altered by using a variety of transformations, for instance self-modifying code, stack operations or even splitting the factors of simple mathematical functions. Moreover, binary obfuscation is also used to defeat automated network traffic analyzers such as Intrusion Detection and Prevention Systems. In other words, binary code obfuscation is the technique of altering the original code structure while maintaining its original functionality. In the next pages of this paper we will explore the theory and practice of binary code obfuscation as well as a number of techniques that can be used.

    19 downloads

    0 comments

    Submitted

  5. Array Data Transformation for Source Code

    Obfuscation is a low-cost software protection methodology to avoid reverse engineering and re-engineering of applications. Source code obfuscation aims at obscuring the source code to hide the functionality of the code. This paper proposes array data transformation in order to obfuscate source code that uses arrays. The applications using the proposed data structures force the programmer to obscure the logic manually. This makes the resulting obscured code hard to reverse engineer and also protects the functionality of the code.
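    The abstract does not say which array transformation the paper uses, so the Python sketch below shows one generic possibility only: an index permutation that stores logical element i in physical slot (a*i + b) mod n. The class name `PermutedArray` and the constants `a` and `b` are assumptions for this illustration.

```python
from math import gcd


class PermutedArray:
    """Fixed-size array whose physical layout is scrambled by an affine index map."""

    def __init__(self, values, a=7, b=3):
        n = len(values)
        # gcd(a, n) == 1 makes i -> (a*i + b) % n a bijection on 0..n-1.
        if n == 0 or gcd(a, n) != 1:
            raise ValueError("need a non-empty array and gcd(a, len) == 1")
        self._n, self._a, self._b = n, a, b
        self._slots = [None] * n
        for i, value in enumerate(values):
            self._slots[self._index(i)] = value

    def _index(self, i):
        if not 0 <= i < self._n:
            raise IndexError("index out of range")
        return (self._a * i + self._b) % self._n

    def __getitem__(self, i):
        return self._slots[self._index(i)]

    def __setitem__(self, i, value):
        self._slots[self._index(i)] = value

    def __len__(self):
        return self._n


data = PermutedArray([10, 20, 30, 40, 50])
data[2] = 31
print([data[i] for i in range(len(data))])
```

    Code that reads the array through the class sees the logical order, while the stored order in memory differs, which is the sense in which the data layout is obscured.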

    17 downloads

    0 comments

    Submitted

  6. Automated Approach to the Identification and Removal of Code Obfuscation

    Malware authors and owners of proprietary software algorithms often use code obfuscation techniques to hinder users from gaining understanding about the integral parts of their applications. Simple instruction sequences are obscured, control flow is disorganized, and unnecessary instructions are introduced to confuse disassembly tools, and the reverse engineer.

    The Deobfuscator combines instruction emulation and pattern recognition to determine code control flow, interpret the intended results of obfuscated code, and transform instruction sequences to enhance the readability of code where all states are known.

    25 downloads

    0 comments

    Submitted

  7. Automatic Binary Deobfuscation

    This paper gives an overview of our research in the automation of the process of software protection analysis. We will focus more particularly on the problem of obfuscation.

    Our current approach is based on a local semantic analysis, which aims to rewrite the binary code in a simpler (easier to understand) way. This approach has the advantage of not relying on a manual search for patterns of obfuscation. This way of manipulating the code is, in the end, quite similar to the optimising stage of most compilers. We will exhibit concrete results based on the development of a prototype and its application to a test target. Current limitations and future prospects will be discussed as well.

    As a continuation of our work from last year, we focus on the automation of the software protection analysis process. We will focus more particularly on the problem of obfuscation.

    This problem is crucial as most malicious binaries (like viruses or trojans) use this kind of protection to slow down their analysis and to make their detection harder. Automation is a key step in order to face the constant growth of the amount of malware, year after year.

    Our previous paper was mainly focused on the attack and suppression of protection mechanisms using the Metasm framework. It provides many useful primitives to deal with protected code: control flow graph manipulation, recompilation, filtering processor. Nevertheless, most of these approaches rely on tedious manual identification of the patterns used by the protection.

    We will now present the development of our new methods, relying on a semantic analysis of the binary code to extract a simpler representation. The objective is no longer to seek and destroy known patterns, but to proceed to a complete, on-the-fly, optimised code rewriting.
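    To give a feel for what optimiser-style rewriting means here, the following Python sketch folds runs of constant additions and subtractions on the same register into a single instruction. It is a toy peephole pass over hypothetical `(op, register, immediate)` tuples; it does not use Metasm and does not reproduce the authors' semantic analysis.

```python
MASK = 0xFFFFFFFF  # model 32-bit wraparound arithmetic


def simplify(instructions):
    """Fold adjacent add/sub immediates on one register; drop net-zero results."""
    out = []
    for op, reg, imm in instructions:
        if op == "add":
            delta = imm
        elif op == "sub":
            delta = -imm
        else:
            out.append((op, reg, imm))  # anything else passes through untouched
            continue
        # Merge with a directly preceding add on the same register.
        if out and out[-1][0] == "add" and out[-1][1] == reg:
            delta += out.pop()[2]
        delta &= MASK
        if delta:
            out.append(("add", reg, delta))
    return out


# Hypothetical obfuscated sequence: the net effect is eax = 10 + 1.
obfuscated = [
    ("mov", "eax", 10),
    ("add", "eax", 0x1234),
    ("sub", "eax", 0x1000),
    ("sub", "eax", 0x234),
    ("add", "eax", 1),
]

print(simplify(obfuscated))
```

    The pass needs no catalogue of known obfuscation patterns; it only relies on the arithmetic meaning of the instructions, which is the point the abstract makes about semantic rewriting.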

    We will exhibit concrete results obtained by applying these methods to a test target. Then, current limitations and future prospects will be discussed.

    20 downloads

    0 comments

    Submitted

  8. Automatic Deobfuscation of Emulation-Obfuscated Software

    Malicious software is usually obfuscated to avoid detection and resist analysis. When new malware is encountered, such obfuscations have to be penetrated or removed (deobfuscated) in order to understand the internal logic of the code and devise countermeasures. This paper discusses an approach for deobfuscation of code that uses emulation-based obfuscation, a particularly challenging class of obfuscations that have been deployed in recent years. Our approach is highly general in that we do not make any assumptions about the nature of the obfuscations used; instead, we use semantics-preserving program transformations to simplify away obfuscation code. Experiments show that our approach is effective in extracting the internal logic from code obfuscated using a variety of emulation-based obfuscators, including tools such as Themida that previous approaches could not handle.

    27 downloads

    0 comments

    Submitted

  9. Automatic Simplification of Obfuscated Javascript Code

    JavaScript is a scripting language that is commonly used to create sophisticated interactive client-side web applications. It can also be used to carry out browser-based attacks on users. Malicious JavaScript code is usually highly obfuscated, making detection a challenge. This paper describes a simple approach to deobfuscation of JavaScript code based on dynamic analysis and slicing. Experiments using a prototype implementation indicate that our approach is able to penetrate multiple layers of complex obfuscations and extract the core logic of the computation.

    18 downloads

    0 comments

    Submitted

  10. Basing Obfuscation on Simple Tamper-Proof Hardware Assumptions

    Code obfuscation is one of the most powerful concepts in cryptography. It could yield functional encryption, digital rights management, and maybe even secure cloud computing. However, general code obfuscation has been proven impossible and the research then focused on obfuscating very specific functions, studying weaker security definitions for obfuscation, and using tamper-proof hardware tokens to achieve general code obfuscation. Following this last line this work presents the first scheme which bases general code obfuscation of multiple programs on one single stateless hardware token.
    Our construction is proven secure in the UC-framework and proceeds in three steps:

    1. We construct an obfuscation scheme based on fully homomorphic encryption (FHE) and a hybrid functionality conditional decrypt, which decrypts the result of a homomorphic computation given a proof that the computation was performed as intended. One difficulty of the first step is possible decryption errors in the FHE. These decryption errors can occur whenever the randomness for the encryption is chosen maliciously by the receiver of the obfuscated code. Such decryption errors could then make a real obfuscated computation distinguishable from a black box use of the non-obfuscated program.

    2. Given two common reference strings (CRS) we construct a UC-protocol realizing the functionality conditional decrypt with a stateless hardware token. As the token is stateless it is resettable by a dishonest receiver and the proofs given to the token must be resettably sound. One additional difficulty occurs when the issuer of the token can be corrupted. A malicious token can be stateful and it cannot be prevented that it aborts after a hardwired number of invocations. To prevent adaptive behavior of a malicious token the data of the receiver has to be hidden from the token and the proofs given to the token must even hide the size of the program and the length of the computation.

    3. Last, we construct a protocol that constructs a CRS with a stateless hardware token. Care has to be taken here to not let the token learn anything about the resulting CRS which could not be simulated, because the very same token will later be used in a protocol based on the security of this CRS.

    15 downloads

    0 comments

    Submitted

  11. Behavioral Analysis of Obfuscated Code

    Classically, the procedure for reverse engineering binary code is to use a disassembler and to manually reconstruct the logic of the original program. Unfortunately, this is not always practical as obfuscation can make the binary extremely large by over-complicating the program logic or adding bogus code.

    We present a novel approach, based on extracting semantic information by analyzing the behavior of the execution of a program. As obfuscation consists in manipulating the program while keeping its functionality, we argue that there are some characteristics of the execution that are strictly correlated with the underlying logic of the code and are invariant after applying obfuscation. We aim at highlighting these patterns, by introducing different techniques for processing memory and execution traces.

    Our goal is to identify interesting portions of the traces by finding patterns that depend on the original semantics of the program. Using this approach, the high-level information about the business logic is revealed and the amount of binary code to be analyzed is considerably reduced.

    For testing and simulations we used obfuscated code of cryptographic algorithms, as our focus is DRM systems and mobile banking applications. We argue however that the methods presented in this work are generic and apply to other domains where obfuscated code is used.

    21 downloads

    0 comments

    Submitted

  12. Binary Code Obfuscation Through C++ Template Meta-Programming

    Defending programs against illegitimate use and tampering has become both a field of study and a large industry. Code obfuscation is one of several strategies to stop, or slow down, malicious attackers from gaining knowledge about the internal workings of a program. Binary code obfuscation tools often come in two (sometimes overlapping) flavors. On the one hand there are binary protectors, tools outside of the development chain that translate a compiled binary into another, less intelligible one. On the other hand there are software development kits that require a significant effort from the developer to ensure the program is adequately obfuscated. In this paper, we present obfuscation methods that are easily integrated into the development chain of C++ programs, by using the compiler itself to perform the obfuscated code generation. This is accomplished by using advanced C++ techniques, such as operator overloading, template metaprogramming, expression templates, and more. We achieve obfuscated code featuring randomization, opaque predicates and data masking. We evaluate our obfuscating transformations in terms of potency, resilience, stealth, and cost.

    16 downloads

    0 comments

    Submitted

  13. Binary Code Obfuscations in Prevalent Packer Tools

    Security analysts' understanding of the behavior and intent of malware samples depends on their ability to build high-level analysis products from the raw bytes of program binaries. Thus, the first steps in analyzing defensive malware are understanding what obfuscations are present in real-world malware binaries, how these obfuscations hinder analysis, and how they can be overcome. To this end, we present a thorough examination of the obfuscation techniques used by the packer tools that are most popular with malware authors [Bustamante 2008]. Though previous studies have discussed the current state of binary packing [Yason 2007], anti-debugging [Falliere 2007], and anti-unpacking [Ferrie 2008a] techniques, there have been no comprehensive studies of the obfuscation techniques that are applied to binary code. While some of the individual obfuscations that we discuss have been reported independently, this paper consolidates the discussion while adding substantial depth and breadth to it.

    We describe obfuscations that make binary code difficult to discover (e.g., control-transfer obfuscations, exception-based control transfers, incremental code unpacking, code overwriting); to accurately disassemble into instructions (e.g., ambiguous code and data, disassembler fuzz-testing, non-returning calls); to structure into functions and basic blocks (e.g., obfuscated calls and returns, call-stack tampering, overlapping functions and basic blocks); to understand (e.g., obfuscated constants, calling-convention violations, chunked control-flow, do-nothing code); and to manipulate (e.g., self-checksumming, anti-relocation, stolen-bytes techniques). We also discuss techniques that mitigate the impact of these obfuscations on analysis tools such as disassemblers, decompilers, instrumenters, and emulators. This work is done in the context of our project to build tools for the analysis [Jacobson et al. 2011; Rosenblum et al. 2008] and instrumentation [Bernat and Miller 2011; Hollingsworth et al. 1994] of binaries, and builds on recent work that extends these analyses to malware binaries that are highly defensive [Bernat et al. 2011; Roundy and Miller 2010].

    We begin by describing the methodology and tools we used to perform this study. We proceed to a taxonomy of the obfuscation techniques, along with current approaches to dealing with these techniques, and conclude by presenting statistics and observations on the various obfuscation techniques and tools.

    26 downloads

    0 comments

    Submitted

  14. Breaking and Improving Protocol Obfuscation

    Different techniques for traffic classification are utilized in various fields of application. In this technical report, we look closer at how statistical analysis can be used to identify network protocols. We show how even obfuscated application layer protocols, such as BitTorrent's MSE protocol and Skype, can be identified by fingerprinting statistically measurable properties of TCP and UDP sessions. We also look closer at the properties our protocol identification algorithm exploits to identify these obfuscated protocols, protocols that are designed not to be detectable and are thus considered to be very hard to classify. Many of the analyzed protocols are shown to have statistically measurable properties in payload data, flow behavior, or both. Based on this new insight, we propose techniques that can improve future versions of obfuscated protocols, inhibiting identification through this type of statistical analysis. These techniques include better obfuscation of payload data and flow features as well as hiding inside tunnels of well-known protocols. This report is intended to provide feedback and suggestions for improvement to creators of obfuscated network protocols, and should thus help to facilitate sustained network neutrality on the Internet.
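    As one example of a statistically measurable payload property, the Python sketch below computes the Shannon entropy of a payload's byte distribution. Choosing entropy is an assumption made for this illustration; the abstract does not list the specific properties the report's algorithm uses, and the function name `byte_entropy` is hypothetical.

```python
import math
from collections import Counter


def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A plain-text request uses few distinct byte values, so its entropy is low.
plain = b"GET /index.html HTTP/1.1\r\n"

# Stand-in for an obfuscated or encrypted payload: every byte value once.
uniform = bytes(range(256))

print(round(byte_entropy(plain), 2), round(byte_entropy(uniform), 2))
```

    A payload that looks uniformly random is itself a measurable trait, which gives some intuition for why the report suggests hiding inside tunnels of well-known protocols.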

    15 downloads

    0 comments

    Submitted

  15. Code Deobfuscation

    Measuring the security of code obfuscation is difficult. A novel obfuscation transformation is in some cases only measured in terms of code expansion and speed, which are in fact only side effects of the transformation. A first step to define a security value to an obfuscation transformation could be having a look at what a cracker is able to reveal from an obfuscated program. This abstract first of all gives a short overview of existing techniques to obfuscate. Then, we describe existing techniques that can be used to deobfuscate, which were sometimes originally meant for other purposes, and new techniques which we are working on to deobfuscate.

    26 downloads

    0 comments

    Submitted

  16. Code Obfuscation and Lighty Compressor Unpacking

    When I first started this article I had no idea what "Lighty Compressor" is. After a little research I found out that it's a code compressor mostly used in the malware developing scene, which means it's not freely downloadable.

    The text below does not pretend to be professionally written, and I don't pretend to be a reverse engineering expert. However, this is my approach to defeating code obfuscation and Lighty's compression.

    The application I unpack in the lines below is an old malware sample, probably from the end of 2008, and it's called "buritos.exe".
    So, get yourself a beer and continue reading!

    17 downloads

    0 comments

    Submitted

  17. Code Obfuscation and Malware Detection by Abstract Interpretation

    An obfuscating transformation aims at confusing a program in order to make it more difficult to understand while preserving its functionality. Software protection and malware detection are two major applications of code obfuscation. Software developers use code obfuscation in order to defend their programs against attacks on intellectual property, usually called malicious host attacks. In fact, by making the programs more difficult to understand it is possible to obstruct malicious reverse engineering, a typical attack on the intellectual property of programs. On the other side, malware writers usually obfuscate their malicious code in order to avoid detection. In this setting, the ability of code obfuscation to foil most of the existing detection techniques, such as misuse detection algorithms, lies in their purely syntactic nature, which makes malware detection sensitive to slight modifications of program syntax. In the software protection scenario, researchers try to develop sophisticated obfuscating techniques that are able to resist as many attacks as possible. In the malware detection scenario, on the other hand, it is important to design advanced detection algorithms in order to detect as many variants of obfuscated malware as possible. It is clear how both malicious host and malicious code attacks represent harmful threats to the security of computer networks.

    16 downloads

    0 comments

    Submitted

  18. Code Obfuscation Literature Survey

    In this paper we survey the current literature on code obfuscation and review current practices as well as applications. We analyse the different obfuscation techniques in relation to protection of intellectual property and the hiding of malicious code. Surprisingly, the same techniques used to thwart reverse engineers are used to hide malicious code from virus scanners. Additionally, obfuscation can be used to protect against malicious code injection and attacks. Though obfuscation transformations can protect code, they have limitations in the form of larger code footprints and reduced performance.

    16 downloads

    0 comments

    Submitted

  19. Concepts and Techniques in Software Watermarking and Obfuscation

    With the rapid development of the internet, copying a digital document has become so easy and cheap that digital piracy is rampant. As a result, software protection has become a vital issue in the current computer industry and a hot research topic.

    Software watermarking and obfuscation are techniques to protect software from unauthorized access, modification, and tampering. While software watermarking tries to insert a secret message called software watermark into the software program as evidence of ownership, software obfuscation translates software into a semantically-equivalent one that is hard for attackers to analyse. In this thesis, firstly, we present a survey of software watermarking and obfuscation. Then we formalize two important concepts in software watermarking: extraction and recognition and we use a concrete software watermarking algorithm to illustrate issues in these two concepts. We develop a technique called the homomorphic functions through residue numbers to obfuscate variables and data structures in software programs. Lastly, we explore the complexity issues in software watermarking and obfuscation.
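    The idea of hiding a variable behind residue numbers can be sketched in Python as below. This is a generic residue number system example under stated assumptions, not the thesis's exact construction: the moduli and the function names `encode`, `add`, `mul` and `decode` are hypothetical. It requires Python 3.8 or later for `math.prod` and the modular inverse form of `pow`.

```python
from math import prod

MODULI = (7, 11, 13)  # pairwise coprime; values are recoverable modulo 7*11*13 = 1001


def encode(x):
    """Replace an integer by its tuple of residues."""
    return tuple(x % m for m in MODULI)


def add(a, b):
    """Add two encoded values component-wise, without decoding them."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def mul(a, b):
    """Multiply two encoded values component-wise, without decoding them."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def decode(residues):
    """Recover the integer with the Chinese Remainder Theorem."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        partial = big_m // m
        total += r * partial * pow(partial, -1, m)  # modular inverse of partial mod m
    return total % big_m


s = decode(add(encode(123), encode(456)))
p = decode(mul(encode(12), encode(34)))
print(s, p)
```

    Because the encoding commutes with addition and multiplication, a program can compute on the residue tuples throughout and decode only at the end, as long as results stay below the product of the moduli.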

    15 downloads

    0 comments

    Submitted

  20. Context-Sensitive Analysis of Obfuscated x86 Executables

    A method for context-sensitive analysis of binaries that may have obfuscated procedure call and return operations is presented. Such binaries may use operators to directly manipulate the stack instead of using native call and ret instructions to achieve equivalent behavior. Since the definition of context-sensitivity and algorithms for context-sensitive analysis have thus far been based on the specific semantics associated with procedure call and return operations, classic interprocedural analyses cannot be used reliably for analyzing programs in which these operations cannot be discerned. A new notion of context-sensitivity is introduced that is based on the state of the stack at any instruction. While changes in ‘calling’-context are associated with transfer of control, and hence can be reasoned about in terms of paths in an interprocedural control flow graph (ICFG), the same is not true of changes in ‘stack’-context. An abstract interpretation based framework is developed to reason about stack-contexts and to derive analogues of call-strings based methods for the context-sensitive analysis using stack-context. The method presented is used to create a context-sensitive version of Venable et al.’s algorithm for detecting obfuscated calls. Experimental results show that the context-sensitive version of the algorithm generates more precise results and is also computationally more efficient than its context-insensitive counterpart.

    20 downloads

    0 comments

    Submitted

  21. Control Code Obfuscation by Abstract Interpretation

    Control code obfuscation is intended to prevent malicious reverse engineering of software by masking the program control flow. These obfuscating transformations often rely on the existence of opaque predicates, that support the design of transformations that break up the program control flow. We prove that an algorithm for control obfuscation by opaque predicate insertion can be systematically derived as an abstraction of a suitable semantic transformation. In this framework, deobfuscation is interpreted as an attacker which can observe the computational behaviour of programs up to a given precision degree. Both obfuscation and deobfuscation can therefore be interpreted as approximations of program semantics, where approximation is formalized using abstract interpretation theory. In particular we prove that abstract interpretation provides here the adequate setting to measure the potency of an obfuscation algorithm by comparing the degree of abstraction of the most abstract domains which are able to disclose opaque predicates.

    13 downloads

    0 comments

    Submitted

  22. Deobfuscation and Detection of Malicious PDF Files with High Accuracy

    Due to its high popularity and rich functionality, the Portable Document Format (PDF) has become a major vector for malware propagation. To detect malicious PDF files, the first step is to extract and de-obfuscate JavaScript codes from the document, for which an effective technique is yet to be created. However, existing static methods cannot de-obfuscate JavaScript codes, existing dynamic methods bring high overhead, and existing hybrid methods introduce high false negatives.

    Therefore, in this paper, we present MPScan, a scanner that combines dynamic JavaScript de-obfuscation and static malware detection. By hooking Adobe Reader's native JavaScript engine, JavaScript source code and opcode can be extracted on the fly after the source code is parsed and then executed. We also perform a multilevel analysis on the resulting JavaScript strings and opcode to detect malware. Our evaluation shows that regardless of obfuscation techniques, MPScan can effectively de-obfuscate and detect 98% of malicious PDF samples.

    16 downloads

    0 comments

    Submitted

  23. Deobfuscation of Packed and Virtualization-Obfuscated Protected Binaries

    Code obfuscation techniques are increasingly being used in software for such reasons as protecting trade secret algorithms from competitors and deterring license tampering by those wishing to use the software for free. However, these techniques have also grown in popularity in less legitimate areas, such as protecting malware from detection and reverse engineering. This work examines two such techniques, packing and virtualization-obfuscation, and presents new behavioral approaches to analysis that may be relevant to security analysts whose job it is to defend against malicious code. These approaches are robust against variations in obfuscation algorithms, such as changing encryption keys or virtual instruction byte code.

    Packing refers to the process of encrypting or compressing an executable file. This process scrambles the bytes of the executable so that byte-signature matching algorithms commonly used by anti-virus programs are ineffective. Standard static analysis techniques are similarly ineffective since the actual byte code of the program is hidden until after the program is executed. Dynamic analysis approaches exist, but are vulnerable to dynamic defenses. We detail a static analysis technique that starts by identifying the code used to "unpack" the executable, then uses this unpacker to generate the unpacked code in a form suitable for static analysis. Results show we are able to correctly unpack several encrypted and compressed malware samples, while still handling several dynamic defenses.
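
    The effect of packing on byte-signature matching can be sketched in a few lines of Python. This is a toy illustration with a single-byte XOR "packer"; the signature bytes, key, and function names are assumptions made for the example, not details from the work.

```python
SIGNATURE = b"\xde\xad\xbe\xef"  # hypothetical byte signature a scanner looks for


def xor_bytes(data: bytes, key: int) -> bytes:
    # XOR with a fixed key is its own inverse, so this both packs and unpacks.
    return bytes(b ^ key for b in data)


def scanner_matches(image: bytes) -> bool:
    # Naive byte-signature matching over the raw file image.
    return SIGNATURE in image


payload = b"\x90\x90" + SIGNATURE + b"\xc3"
packed = xor_bytes(payload, 0x5A)    # what is stored on disk
unpacked = xor_bytes(packed, 0x5A)   # what the unpacker stub restores at run time
```

    The scanner matches `payload` but not `packed`, while `unpacked` is byte-for-byte identical to the original, which is why recovering and running the unpacker logic makes static analysis possible again.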

    Virtualization-obfuscation is a technique that translates the original program into virtual instructions, then builds a customized virtual machine for these instructions. As with packing, the byte-signature of the original program is destroyed. Furthermore, static analysis of the obfuscated program reveals only the structure of the virtual machine, and dynamic analysis produces a dynamic trace where original program instructions are intermixed with, and often indistinguishable from, virtual machine instructions. We present a dynamic analysis approach whereby all instructions that affect the external behavior of the program are identified, thus building an approximation of the original program that is observationally equivalent. We achieve good results at both identifying instructions from the original program, as well as eliminating instructions known to be part of the virtual machine.
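
    A minimal sketch of the idea, assuming a tiny stack-based virtual machine: the computation `(a + b) * c` is rewritten as virtual instructions, and only the interpreter loop remains visible as ordinary code. The opcode values and names are hypothetical.

```python
# Hypothetical opcode numbering; a real protector would typically vary it per build.
PUSH_ARG, ADD, MUL, RET = 0x11, 0x22, 0x33, 0x44


def original(a: int, b: int, c: int) -> int:
    return (a + b) * c


# The same computation expressed as virtual instructions.
BYTECODE = [PUSH_ARG, 0, PUSH_ARG, 1, ADD, PUSH_ARG, 2, MUL, RET]


def run_vm(code: list, args: tuple) -> int:
    stack = []
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH_ARG:
            # The operand byte selects which argument to push.
            stack.append(args[code[pc]])
            pc += 1
        elif op == ADD:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == MUL:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x}")
```

    Reading `run_vm` statically shows only a fetch-decode-dispatch loop, and an execution trace contains far more dispatch bookkeeping than arithmetic, which mirrors the analysis difficulty described above.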

    38 downloads

    0 comments

    Submitted

  24. Deobfuscation of Virtualization-Obfuscated Software

    When new malware is discovered, it is important for researchers to analyze and understand it as quickly as possible. This task has been made more difficult in recent years as researchers have seen an increasing use of virtualization-obfuscated malware code. These programs are difficult to comprehend and reverse engineer, since they are resistant to both static and dynamic analysis techniques. Current approaches to dealing with such code first reverse-engineer the byte code interpreter, then use this to work out the logic of the byte code program. This outside-in approach produces good results when the structure of the interpreter is known, but cannot be applied to all cases. This paper proposes a different approach to the problem that focuses on identifying instructions that affect the observable behaviour of the obfuscated code. This inside-out approach requires fewer assumptions, and aims to complement existing techniques by broadening the domain of obfuscated programs eligible for automated analysis. Results from a prototype tool on real-world malicious code are encouraging.

    30 downloads

    0 comments

    Submitted

  25. Exception Handling to Build Code Obfuscation Techniques

    Microsoft's .NET Framework and the Java platform are based on a just-in-time compilation philosophy. Software developed using these technologies is executed in a hardware-independent framework, which provides a full object-oriented environment, and in some cases allows the interaction of several components written in different programming languages. This flexibility is achieved by compiling into an intermediate code which is platform independent. Java is compiled into bytecode, and Microsoft .NET programs are compiled into MSIL (Microsoft Intermediate Language). However, this flexibility comes at a price. With freeware tools available on the Internet, it is quite easy to decompile intermediate code and obtain a working, readable version of the source code. Obfuscation is the most accepted and commercially available technique that developers can use to protect their intellectual property. In this work, we propose the use of try-catch mechanisms available in .NET as a way to improve the quality of one of the building blocks of obfuscation: opaque predicates.
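
    The general idea of driving an opaque predicate through exception handling can be sketched as follows. This is an illustration in Python rather than .NET, and it is not the construction proposed in the paper; the function names and decoy code are hypothetical.

```python
def original(n: int) -> int:
    return 2 * n + 3


def obfuscated(n: int) -> int:
    try:
        # n * (n - 1) is always even, so the divisor is always 0 and this
        # line always raises; the statement after it is a decoy.
        _ = 1 // ((n * (n - 1)) % 2)
        return n - 42  # decoy, never reached
    except ZeroDivisionError:
        return 2 * n + 3  # the real computation lives in the handler
```

    Control reaches the real code only through the handler, so a decompiler that models the try body as the normal path presents the decoy as the apparent program logic.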

    17 downloads

    0 comments

    Submitted

