In a highly interconnected world of computers, malicious code is an ever-present threat. Such code can break into hosts by a variety of methods, such as exploiting known flaws and vulnerabilities in ordinary programs. Detecting the presence of malicious code on a given host is therefore a problem of high concern. Whenever such hostile programs succeed in spreading over the Internet, businesses suffer significant losses. For example, the mi2g website reports that within one quarter the NetSky worm and its A through Q variants together had already caused between $35.8 billion and $43.8 billion in estimated economic damage worldwide. The website also reports that, in March, the combined loss due to the three worms Beagle, MyDoom, and NetSky crossed the $100 billion mark within a week.
Programmers obfuscate their code to make it difficult to extract information from it. Programs may be obfuscated to protect intellectual property and to increase the security of the code (by making it difficult for others to identify vulnerabilities). Programs may also be obfuscated to hide malicious behavior and to evade detection by anti-virus scanners. Most writers of malicious code add or rearrange code in their programs to make detection difficult, if not impossible. Recent viruses that employ obfuscating transformations to conceal their behavior are the most difficult to detect. These viruses are called metamorphic viruses.
The primary goal of obfuscation is to increase the effort involved in analyzing a program, whether manually or automatically. In anti-virus scanning, the context of our study, automated analysis may be performed at the desktop, at quarantine servers in an enterprise, or on back-end machines in an anti-virus company's laboratory. In contrast, manual analysis is performed by engineers in the Emergency Response Teams of anti-virus companies and research laboratories. The goal of obfuscation in malicious programs (viruses, worms, Trojans, spyware, and backdoors) is to escape detection by automated analysis and to significantly delay detection by manual analysis.
A common obfuscation technique found in viruses, a term used henceforth to mean malicious programs in general, is the obfuscation of call instructions. For instance, the call addr instruction may be replaced by two push instructions and a ret instruction: the first push pushes the address of the instruction after the ret instruction, and the second push pushes the address addr. When the ret executes, it pops addr and transfers control there, leaving the return address on the stack exactly as a call would. The code may be further obfuscated by spreading the three instructions apart and by splitting each instruction into multiple instructions.
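The equivalence between the two forms can be illustrated with a minimal sketch of a toy stack machine in Python. The instruction set (call, push, ret, out, halt), the run function, and the instruction addresses are hypothetical and chosen only for illustration; the sketch does not model real x86 semantics.

```python
def run(program, max_steps=100):
    """Execute a toy program and return the labels emitted by 'out'."""
    stack, pc, output = [], 0, []
    for _ in range(max_steps):
        op, *args = program[pc]
        if op == "call":        # push return address, jump to target
            stack.append(pc + 1)
            pc = args[0]
        elif op == "push":      # push a constant (here: an address)
            stack.append(args[0])
            pc += 1
        elif op == "ret":       # pop an address and jump to it
            pc = stack.pop()
        elif op == "out":       # record an observable event
            output.append(args[0])
            pc += 1
        elif op == "halt":
            return output
    raise RuntimeError("step limit exceeded")

# Plain form: an explicit call to the procedure at address 3.
PLAIN = [
    ("call", 3),        # 0
    ("out", "after"),   # 1: return lands here
    ("halt",),          # 2
    ("out", "callee"),  # 3: procedure body
    ("ret",),           # 4
]

# Obfuscated form: push return address, push target, then ret.
OBFUSCATED = [
    ("push", 3),        # 0: address of the instruction after the ret
    ("push", 5),        # 1: address of the procedure (addr)
    ("ret",),           # 2: pops 5 and jumps to the procedure
    ("out", "after"),   # 3: procedure's ret lands here
    ("halt",),          # 4
    ("out", "callee"),  # 5: procedure body
    ("ret",),           # 6
]

print(run(PLAIN))       # ['callee', 'after']
print(run(OBFUSCATED))  # ['callee', 'after']
```

Both programs produce the same observable behavior, yet the second contains no call instruction at all.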
Obfuscation of call instructions breaks most static-analysis-based methods for detecting a virus, since these methods depend on recognizing call instructions to (a) identify the kernel functions used by the program and (b) identify procedures in the code. The obfuscation also removes important cues that are used during manual analysis. We are then left only with dynamic analysis, i.e., running a suspect program in an emulator and observing the kernel calls it makes. Such analysis can easily be thwarted by what is termed a "picky" virus, one that does not always execute its malicious payload. In addition, dynamic analyzers must use some heuristic to determine when to stop analyzing a program, for it may not terminate without user input. Virus writers can defeat stopping heuristics by introducing a delay loop that simply wastes cycles. It is therefore important to detect obfuscated calls for both static and dynamic analysis of viruses.
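As a rough illustration of why call recognition fails, consider a naive scanner that collects the operands of explicit call instructions from a disassembly listing. This is only a sketch under stated assumptions: the call_targets function, the listing format, and the names kernel_fn and after are hypothetical, and the scanner does not represent any particular anti-virus tool.

```python
def call_targets(listing):
    """Return the operands of explicit 'call' instructions in a listing."""
    targets = []
    for line in listing:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "call":
            targets.append(parts[1])
    return targets

# Hypothetical listings; kernel_fn stands in for some kernel function.
plain = [
    "call kernel_fn",
    "mov eax, 1",
]
obfuscated = [
    "push after",       # address of the instruction after the ret
    "push kernel_fn",   # address of the function to invoke
    "ret",              # transfers control to kernel_fn
    "after: mov eax, 1",
]

print(call_targets(plain))       # ['kernel_fn']
print(call_targets(obfuscated))  # []
```

The scanner reports the kernel function for the plain listing but nothing for the obfuscated one, even though both invoke the same function.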