Advanced AOSP Subsystems

OAT Format

Overview

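This lesson explores the OAT format, the file format the Android Runtime (ART) uses to store Ahead-Of-Time (AOT) compiled code. ART first shipped as an opt-in preview in Android 4.4 (KitKat) and became the default runtime in Android 5.0 (Lollipop). An OAT file carries the native machine code generated from an app's DEX bytecode, together with the metadata ART needs to map that code back to the original DEX classes and methods.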

OAT File Structure: The ELF Wrapper

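At its core, an OAT file is an ELF (Executable and Linkable Format) shared object, even though it carries an .odex or .oat extension rather than .so. Because the compiled code is wrapped in ELF, ART can load it with dlopen() through Android's dynamic linker (linker or linker64, a user-space component of Bionic rather than part of the Linux kernel) and have the executable code mapped directly into memory. When dlopen() cannot be used, ART falls back to its own ELF loader.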

Because OAT files are ELF files, standard Linux utilities like readelf, objdump, and nm can be used to inspect their structure.

# Example: Inspecting an OAT file's ELF sections
readelf -S /data/app/~~<random_string>/com.example.app-<random_string>/oat/arm64/base.odex
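
The ELF claim can also be checked from a host machine after copying the file off the device (for example with adb pull). The following is a minimal Python sketch, not part of any AOSP tooling: the function name describe_elf and the small MACHINES table are illustrative choices, and only the fixed fields of the ELF header (identification bytes, e_type, e_machine) are decoded.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_DYN = 3  # e_type value used by shared objects

# A few e_machine values from the ELF specification (not exhaustive).
MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "arm64"}


def describe_elf(data: bytes) -> dict:
    """Decode the ELF identification bytes plus e_type and e_machine."""
    if len(data) < 20 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    if data[4] not in (1, 2) or data[5] not in (1, 2):
        raise ValueError("unknown ELF class or data encoding")
    bits = 32 if data[4] == 1 else 64      # EI_CLASS
    endian = "<" if data[5] == 1 else ">"  # EI_DATA
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return {
        "bits": bits,
        "is_shared_object": e_type == ET_DYN,
        "machine": MACHINES.get(e_machine, hex(e_machine)),
    }
```

Calling describe_elf(open("base.odex", "rb").read(64)) on a pulled arm64 file should report a 64-bit shared object; a file that does not start with the ELF magic raises ValueError.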

Key ELF Sections in an OAT File

  1. .rodata (Read-Only Data): Contains the OAT header and the metadata linking the compiled methods back to their original DEX definitions. Before Android 8.0 it also embedded the original .dex files; since then the DEX data lives in a companion .vdex file.
  2. .text (Executable Code): Contains the actual native machine code (for example ARM, ARM64, x86, or x86_64) generated by the dex2oat compiler.
  3. .bss (Block Started by Symbol): Used for uninitialized data, such as ART's internal state for resolved strings and classes.

OAT Header and Metadata

The OAT header is stored at the beginning of the .rodata section. It contains crucial information that ART needs to load and manage the optimized code.

// Simplified OAT Header in C++ (AOSP: art/runtime/oat.h)
class OatHeader {
  uint8_t magic_[4];             // "oat\n"
  uint8_t version_[4];           // e.g., "199\0"
  uint32_t adler32_checksum_;    // Checksum of the OAT file (named oat_checksum_ in newer releases)
  uint32_t instruction_set_;     // e.g., kArm64, kX86_64
  uint32_t instruction_set_features_bitmap_;
  uint32_t dex_file_count_;      // Number of DEX files compiled into this OAT
  uint32_t executable_offset_;   // Offset to the .text section
  uint32_t key_value_store_size_; // Size of the key-value store containing compiler options
  // ...
};
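
To make the header concrete, here is a small Python sketch that decodes the leading fields from a byte buffer. It assumes the simplified field order shown in the struct above and little-endian encoding; real headers add, remove, and reorder fields between OAT versions, so only the magic and version positions should be relied on without consulting oat.h for the matching release. The names parse_oat_header, _FIELDS, and _LAYOUT are illustrative.

```python
import struct

# ASSUMPTION: field order mirrors the simplified struct in this lesson.
# Real OAT headers differ between versions, so check the version first.
_FIELDS = ("checksum", "instruction_set", "isa_features_bitmap",
           "dex_file_count", "executable_offset", "key_value_store_size")
_LAYOUT = struct.Struct("<4s4s6I")  # magic, version, six little-endian uint32


def parse_oat_header(blob: bytes) -> dict:
    """Decode the start of the OAT data (the first bytes of .rodata)."""
    if len(blob) < _LAYOUT.size:
        raise ValueError("buffer too small for the simplified header")
    magic, version, *values = _LAYOUT.unpack_from(blob)
    if magic != b"oat\n":
        raise ValueError("bad OAT magic")
    # The version is stored as ASCII digits with a trailing NUL byte.
    header = {"version": version.rstrip(b"\0").decode("ascii")}
    header.update(zip(_FIELDS, values))
    return header
```

The buffer passed in is expected to start at the OAT header itself, that is, at the beginning of the .rodata section contents rather than at the start of the ELF file.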

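Following the header, and the key-value store appended to it, is a list of OatDexFile structures, one for each original .dex file compiled. Each OatDexFile points to OatClass structures, one per class, which in turn provide the offsets to the compiled native code for each method. A method that was not compiled has no code offset, and ART runs it through the interpreter or the JIT instead.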
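
The lookup chain described above can be pictured with a toy in-memory model. This is a sketch for illustration only: the Python classes reuse the ART names OatDexFile and OatClass, but their fields (dex_location, classes, method_code_offsets) and the find_code_offset helper are hypothetical simplifications, not the on-disk encoding.

```python
from dataclasses import dataclass, field


@dataclass
class OatClass:
    # Method index within the class -> offset of its compiled code.
    # Methods that were not compiled are simply absent in this toy model.
    method_code_offsets: dict = field(default_factory=dict)


@dataclass
class OatDexFile:
    dex_location: str
    # Class definition index -> OatClass.
    classes: dict = field(default_factory=dict)


def find_code_offset(oat_dex_files, dex_location, class_def_index, method_index):
    """Walk OatDexFile -> OatClass -> method; return None if nothing is compiled."""
    for oat_dex_file in oat_dex_files:
        if oat_dex_file.dex_location == dex_location:
            oat_class = oat_dex_file.classes.get(class_def_index)
            if oat_class is None:
                return None
            return oat_class.method_code_offsets.get(method_index)
    return None
```

A None result corresponds to the case where the runtime has no AOT code for the method and has to execute it some other way.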

DEX-to-OAT Compilation Output

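During app installation or background optimization, the dex2oat tool compiles the DEX bytecode into native code. dex2oat is not a daemon: it is a short-lived process that a system daemon (installd, or artd on recent releases) launches for each compilation job. The output is an OAT file (often with an .odex or .oat extension).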

The compilation process involves:

  1. Parsing: Reading the .dex files.
  2. Verification: Ensuring the bytecode is safe and valid.
  3. Compilation: Translating DEX instructions to native machine code using the Optimizing Compiler.
  4. Linking: Resolving references and packaging the output into the ELF wrapper.

Depending on the compilation filter, the .text section may contain native code for nearly all methods (speed), only for the hot methods recorded in a profile (speed-profile), or no compiled methods at all (verify, and the older quicken filter, which Android 12 dropped).
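
The effect of the compilation filter can be sketched as a simple selection function. This is a deliberately simplified model under stated assumptions: it covers only a handful of filter names, treats speed as compiling every method (the real compiler still skips some), and the function name methods_to_compile and its parameters are hypothetical.

```python
def methods_to_compile(compiler_filter, all_methods, profiled_methods=()):
    """Return the set of methods that would receive AOT native code."""
    if compiler_filter == "speed":
        # Simplification: treat "speed" as compiling everything.
        return set(all_methods)
    if compiler_filter == "speed-profile":
        # Only methods that appear in the profile are compiled.
        return set(all_methods) & set(profiled_methods)
    if compiler_filter in ("verify", "quicken"):
        # Bytecode is verified, but no native code is emitted.
        return set()
    raise ValueError(f"filter not covered by this sketch: {compiler_filter}")
```

For example, methods_to_compile("speed-profile", ["a", "b", "c"], ["b"]) selects only "b", which is why a profile-guided OAT file usually has a much smaller .text section than one built with speed.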

Exploring with the oatdump Tool

The oatdump utility is an indispensable tool for reverse engineering and analyzing ART's compilation behavior. It allows you to inspect the contents of an OAT file, view the disassembled native code, and understand the mapping between DEX instructions and machine code.

Using oatdump

You can run oatdump on an emulator or a rooted device:

# Dump only the OAT header
adb shell oatdump --oat-file=/data/app/~~<random_string>/com.example.app-<random_string>/oat/arm64/base.odex --header-only

# Dump the disassembled native code for a specific method
# (both filters are substring matches; the class name uses dotted notation)
adb shell oatdump --oat-file=/path/to/base.odex --class-filter=com.example.MyClass --method-filter=myMethod

When you dump a method, oatdump prints the method's original DEX bytecode followed by the disassembled native code that the AOT compiler generated for it. Comparing the two listings shows which optimizations the compiler applied, such as inlining or eliminated null and bounds checks, and helps when debugging performance issues.
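
When oatdump is driven from a host-side script, the command line can be assembled programmatically. The sketch below is one possible helper, not an official tool: the function name oatdump_command and its parameters are hypothetical, and the flags it emits (--oat-file, --header-only, --class-filter, --method-filter) are the ones listed in oatdump's usage text on recent releases, so confirm them against the build on your device.

```python
import shlex


def oatdump_command(oat_path, class_filter=None, method_filter=None,
                    header_only=False):
    """Build an `adb shell oatdump ...` argument list for subprocess.run()."""
    args = ["oatdump", f"--oat-file={oat_path}"]
    if header_only:
        args.append("--header-only")
    if class_filter:
        args.append(f"--class-filter={class_filter}")
    if method_filter:
        args.append(f"--method-filter={method_filter}")
    # adb hands the joined string to the device shell, so quote each
    # argument to keep characters such as ';' or '~' literal.
    return ["adb", "shell", " ".join(shlex.quote(a) for a in args)]
```

The returned list can be passed to subprocess.run(cmd, capture_output=True, text=True) and the captured stdout searched for the DEX and native code listings.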