Jump to content
Tuts 4 You

Disassembling

Disassembling and analysis of code and its practice...

11 files

  1. BIOS Disassembly Ninjutsu Uncovered

    For many years, there has been a myth among computer enthusiasts and practitioners that PC BIOS (Basic Input Output System) modification is a kind of black art and only a handful of people can do it or only the motherboard vendor can carry out such a task. On the contrary, this book will prove that with the right tools and approach, anyone can understand and modify the BIOS to suit their needs without the existence of its source code. It can be achieved by using a systematic approach to BIOS reverse engineering and modification. An advanced level of this modification technique is injecting a custom code to the BIOS binary.
    There are many reasons to carry out BIOS reverse engineering and modification, from the fun of doing it to achieve higher clock speed in overclocking scenario, patching certain bug, injecting a custom security code into the BIOS, up to commercial interest in the embedded x86 BIOS market. The emergence of embedded x86 platform as consumer electronic products such as TV set-top boxes, telecom-related appliances and embedded x86 kiosks have raised the interest in BIOS reverse engineering and modification. In the coming years, these techniques will become even more important as the state of the art bus protocols have delegate a lot of their initialization task to the firmware, i.e. the BIOS. Thus, by understanding the techniques, one can dig the relevant firmware codes and understand the implementation of those protocols within the BIOS binary.
    The main purpose of the BIOS is to initialize the system into execution environment suitable for the operating system. This task is getting more complex over the years, since x86 hardware evolves quite significantly. It’s one of the most dynamic computing platform on earth. Introduction of new chipsets happens once in 3 or at least 6 month. This event introduces a new code base for the silicon support routine within the BIOS. Nevertheless, the overall architecture of the BIOS is changing very slowly and the basic principle of the code inside the BIOS is preserved over generations of its code. However, there has been a quite significant change in the BIOS scene in the last few years, with the introduction of EFI (extensible Firmware Interface) by several major hardware vendors and with the growth in OpenBIOS project. With these advances in BIOS technology, it’s even getting more important to know systematically what lays within the BIOS.
    In this book, the term BIOS has a much broader meaning than only motherboard BIOS, which is familiar to most of the reader. It also means the expansion ROM. The latter term is the official term used to refer to the firmware in the expansion cards within the PC, be it ISA, PCI or PCI Express.
    So, what can you expect after reading this book? Understanding the BIOS will open a new frontier. You will be able to grasp how exactly the PC hardware works in its lowest level. Understanding contemporary BIOS will reveal the implementation of the latest bus protocol technology, i.e. HyperTransport and PCI-Express. In the software engineering front, you will be able to appreciate the application of compression technology in the BIOS. The most important of all, you will be able to carry out reverse engineering using advanced techniques and tools. You will be able to use the powerful IDA Pro disassembler efficiently. Some reader with advanced knowledge in hardware and software might even want to “borrow†some of the algorithm within the BIOS for their own purposes. In short, you will be on the same level as other BIOS code-diggers.
    This book also presents a generic approach to PCI expansion ROM development using the widely available GNU tools. There will be no more myth in the BIOS and everyone will be able to learn from this state-of-the-art software technology for their own benefits.

    207 downloads

    0 comments

    Submitted

  2. Decompilation to Compiler High IR in a Binary Rewriter

    A binary rewriter is a piece of software that accepts a binary executable program as input, and produces an improved executable as output. This paper describes the first technique in literature to decompile the input binary into an existing compiler's high-level intermediate form (IR). The compiler's back-end is then used to generate the output binary from the IR. Doing so enables the use of the rich set of compiler analysis and transformation passes available in mature compilers. It also enables binary rewriters to perform complex high-level transformations, such as automatic parallelization, not possible in existing binary rewriters.
    Certain characteristics of binary code pose a great challenge while translating a binary to a high-level compiler IR; these include the use of an explicitly addressed stack, lack of function prototypes and the lack of symbols. We present techniques to overcome these challenges. We have built a prototype binary rewriter called SecondWrite that uses LLVM, a widely-used compiler infrastructure, as our intermediate IR, and rewrites both x86 binaries. Our results show that SecondWrite accelerates un-optimized binaries by 27% on average for our benchmarks, and maintains the performance of already optimized binaries without any custom optimizations on our part. We also present two case studies for custom improvement "automatic parallelization and security“ to exemplify the beneï¬ts and applications of a binary rewriter using a high IR.

    129 downloads

    0 comments

    Submitted

  3. Decompilers and Beyond

    Disassemblers and debuggers are the two tools that allow reverse engineers to examine binary applications. Without them, binary codes are just sequences of hexadecimal numbers. Since humans are notoriously bad with digits, only superficial analysis can be done without these tools.
    Basically, the job of a disassembler is very simple: it just maps hexadecimal numbers to instruction mnemonics. The output of such a basic disassembler is a listing with instructions. While this mapping is a big step forward and allows the user to decipher the logic of simple programs, it does not scale well. Analysis of any file bigger than a few kilobytes is problematic because instruction mnemonics are not enough to hold higher level information: labels and comments are needed, as well as facilities to change the representation on the fly.

    193 downloads

    0 comments

    Submitted

  4. Differentiating Code from Data in x86 Binaries

    Robust, static disassembly is an important part of achieving high coverage for many binary code analyses, such as reverse engineering, malware analysis, reference monitor in-lining, and software fault isolation. However, one of the major diffculties current disassemblers face is differentiating code from data when they are interleaved. This paper presents a machine learning-based disassembly algorithm that segments an x86 binary into subsequences of bytes and then classiffes each subsequence as code or data. The algorithm builds a language model from a set of pre-tagged binaries using a statistical data compression technique. It sequentially scans a new binary executable and sets a breaking point at each potential code-to-code and code-to-data/data-to-code transition. The classiffcation of each segment as code or data is based on the minimum cross-entropy. Experimental results are presented to demonstrate the effectiveness of the algorithm.

    145 downloads

    0 comments

    Submitted

  5. Disassembly of Executable Code Revisited

    Machine code disassembly routines form a fundamental component of software systems that statically analyze or modify executable programs. The task of disassembly is complicated by indirect jumps and the presence of non-executable data jump tables, alignment bytes, etc. in the instruction stream. Existing disassembly algorithms are not always able to cope successfully with executable files containing such features and fail silently i.e., produce incorrect disassemblies without any indication that the results they are producing are incorrect. This can be a serious problem, since it can compromise the correctness of a binary rewriting tool. In this paper we examine two commonly-used disassembly algorithms and illustrate their shortcomings. We propose a hybrid approach that performs better than these algorithms in the sense that it is able to detect situations where the disassembly may be incorrect and limit the extent of such disassembly errors. Experimental results indicate that the algorithm is quite effective: the amount of code flagged as incurring disassembly errors is usually quite small.

    174 downloads

    0 comments

    Submitted

  6. GPU-Disasm - A GPU-based x86 Disassembler

    Static binary code analysis and reverse engineering are crucial operations for malware analysis, binary-level software protections, debugging, and patching, among many other tasks. Faster binary code analysis tools are necessary for tasks such as analyzing the multitude of new malware samples gathered every day. Binary code disassembly is a core functionality of such tools which has not received enough attention from a performance perspective. In this paper we introduce GPU-Disasm, a GPU-based disassembly framework for x86 code that takes advantage of graphics processors to achieve efficient large-scale analysis of binary executables. We describe in detail various optimizations and design decisions for achieving both inter-parallelism, to disassemble multiple binaries in parallel, as well as intra-parallelism, to decode multiple instructions of the same binary in parallel. The results of our experimental evaluation in terms of performance and power consumption demonstrate that GPU-Disasm is twice as fast than a CPU disassembler for linear disassembly and 4.4 times faster for exhaustive disassembly, with power consumption comparable to CPU-only implementations.

    141 downloads

    0 comments

    Submitted

  7. P-Code Tutorials

    Three varying tutorials on P-Code.

    164 downloads

    0 comments

    Submitted

  8. Reassembleable Disassembling

    Reverse engineering has many important applications in computer security, one of which is retrofitting software for safety and security hardening when source code is not available. By surveying available commercial and academic reverse engineering tools, we surprisingly found that no existing tool is able to disassemble executable binaries into assembly code that can be correctly assembled back in a fully automated manner, even for simple programs. Actually in many cases, the resulted disassembled code is far from a state that an assembler accepts, which is hard to fix even by manual effort. This has become a severe obstacle. People have tried to overcome it by patching or duplicating new code sections for retrofitting of executables, which is not only inefficient but also cumbersome and restrictive on what retrofitting techniques can be applied to.
    In this paper, we present UROBOROS, a tool that can disassemble executables to the extent that the generated code can be assembled back to working binaries without manual effort. By empirically studying 244 binaries, we summarize a set of rules that can make the disassembled code relocatable, which is the key to reassembleable disassembling. With UROBOROS, the disassembly-reassembly process can be repeated thousands of times. We have implemented a prototype of UROBOROS and tested over the whole set of GNU Coreutils, SPEC2006, and a set of other real-world application and server programs. The experiment results show that our tool is effective with a very modest cost.

    153 downloads

    0 comments

    Submitted

  9. The Art of Disassembly

    Paper covering aspects behind the creation of a disassembler.

    231 downloads

    0 comments

    Submitted

  10. Verified Abstract Interpretation Techniques for Disassembling Low-level Self-modifying Code

    Static analysis of binary code is challenging for several reasons. In particular, standard static analysis techniques operate over control flow graphs, which are not available when dealing with self-modifying programs which can modify their own code at runtime. We formalize in the Coq proof assistant some key abstract interpretation techniques that automatically extract memory safety properties from binary code. Our analyzer is formally proved correct and has been run on several self modifying challenges, provided by Cai et al. in their PLDI 2007 paper.

    120 downloads

    0 comments

    Submitted

  11. Visual Basic - A Decompiling Approach

    Frameworks are getting more and more popular today, Visual Basic is one of them. Personally I hate frameworks, and also most reverser's do. So, why this tutorial? We can consider both the light and the dark side of the problem: frameworks usually put a lot of code in the compiled programs, so it becomes hard to find the way among all that jungle. But they also use sets of pre-built objects, so these objects are always the same and can be recognized, helping the reverser to understand the code itself. In a VB PE you have a lot of information inside the exe, so you can easily extract all the information you need about all components of the program. To analyse a VB application I used this program that was written by a friend of mine (thank you _d31m0s_!). It's a sort of name/serial crackme, but we are not interested in serial fishing, we are interested in how it works and how the vb knows how to build the app itself. I asked my friend to write it adding some event handling (colors, on over, etc) and a simple algorithm to check serial. He also wrote the proggy using more source files and making various subs (some null sub too). We also have the source of all, but we will check them later.

    204 downloads

    0 comments

    Submitted


×
×
  • Create New...