Code Reuse Attacks (CRAs), wh
Control Flow Integrity (CFI) is a term used for computer security techniques which prevent CRAs by monitoring a program’s flow of execution (control flow). CFI techniques do not aim to prevent the sources of attacks, but instead rely on monitoring a program at runtime to catch deviations from the normal behavior. CFI can detect a wide range of attacks, since no matter which attack vector is used, most attacks will eventually cause abnormal program behavior, such as an unexpected control flow change or the execution of tampered instructions.
An effective defense against code injection attacks is non-executable (NX) memory [1], a.k.a W ⊕ X (Write XOR eXecute), a.k.a Data Execution Prevention (DEP). An NX bit is assigned to each page to mark it as either readable and executable, or non-executable but writable. Most high-end modern processors have architectural support for W ⊕ X , while most low-end processors do not. This protection mechanism was circumvented by the invention of code reuse attacks (CRAs), which do not require any code to be injected, but instead uses the existing software for malicious purposes. An example of this is the return-to-libc attack, where the attacker updates the return address to force the currently executing function to return into an attacker-chosen routine. In addition, the attacker places function arguments on the stack, thereby providing him with the ability to supply attacker chosen arguments to a function. While the attacker could return anywhere, libc is convenient since it is included in most C programs. A popular libc attack target is to spawn a shell by returning into system("/bin/sh"), or to disable W ⊕ X by returning into mprotect()/VirtualProtect().
Return oriented programming (ROP) [49] is a powerful CRA which is Turing-complete1 . ROP makes use of code gadgets present inside the existing program. Each gadget consists of a code fragment that ends with a return instruction. The attacker overwrites the stack with a carefully constructed sequence of gadget arguments and gadget addresses. The goal is to invoke a chain of gadgets, with each return instruction leading to the invocation of the next gadget. After the initial function return, the first gadget is executed, leading to the eventual execution of a return, which causes the next gadget to be executed.
Another type of attack, called Jump oriented programming (JOP) [8], is also Turing-complete. The building blocks are also called gadgets, but here each gadget ends with an indirect branch instead of a ret instruction. A dispatch table is used to hold gadget addresses and data, and a CPU register acts as a virtual program counter which points into the dispatch table. A special gadget, called a dispatcher gadget, is used to execute the gadgets inside the dispatch table in sequence. After invoking each functional gadget, the dispatcher gadget is invoked, which advances the virtual program counter and then launches the next functional gadget.
Code randomization protects against CRAs by placing the base addresses of various segments (.text, .data, .bss, etc) at randomized memory addresses. This makes it difficult for attackers to predict target addresses, and is currently used in one form or another in most modern OSs. The security of this technique relies on the low probability of an attacker guessing the randomly placed areas. Therefore, a larger search space means more effective security. The Linux PaX project introduced a code randomization technique, called Address Space Layout Randomization (ASLR), with a patch for the linux kernel in July 2001. ASLR suffers from two main problems. First, the effectiveness of this technique is limited on 32-bit architectures due to the low number of bits available for randomization, which makes the system vulnerable to brute force attacks [50]. Second, it is vulnerable to memory disclosure attacks, since only the base addresses of each segment is randomized. If an attacker gains knowledge of a single address, he could compute the segment base address, which causes the system to again become vulnerable to CRAs. A method to disclose a memory address is by exploiting a format string vulnerability.
Non-control data attacks [13] rely on corrupting data memory which is not directly used by control flow transfer instructions. In the past, it was assumed that non-control data attacks were limited to data leakage (e.g., HeartBleed [22]) or data corruption. However, recently this attack vector was used to launch two different Turing-complete attacks which can circumvent CFI policies (see Section 5.1.1 and Section 5.3.3).