Overview
The Full Rendering Pipeline is the end-to-end journey of a frame in Android. It traces a pixel from a developer's high-level TextView definition, down through the hardware-accelerated rendering thread, across process boundaries to the system compositor, and finally out to the physical display panel.
Application: View Tree Invalidation
The journey begins on the application's Main (UI) Thread. When an application's state changes (e.g., a button is pressed or text is updated), the affected View calls invalidate().
The invalidation propagates up the View hierarchy to the root, marking the affected regions as "dirty" and scheduling a traversal. On the next VSYNC pulse, Choreographer runs that traversal on the UI thread. If a layout was also requested, the UI thread executes measure() and layout() to recalculate sizes and positions; it then runs the draw() pass.
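The invalidate-then-wait-for-VSYNC behavior can be sketched with a toy model. This is plain Python, not the Android API: the `View` and `ViewRoot` classes, `schedule_traversal`, and `on_vsync` are hypothetical names chosen for illustration, and the model tracks whole views rather than dirty rectangles.

```python
class View:
    """Toy stand-in for an Android View; it only models dirty tracking."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.dirty = False

    def invalidate(self):
        # Mark this view and every ancestor, then ask the root for a traversal.
        node = self
        while node.parent is not None:
            node.dirty = True
            node = node.parent
        node.dirty = True
        node.schedule_traversal()

    def schedule_traversal(self):
        # Only the root schedules work; plain views ignore the request.
        pass


class ViewRoot(View):
    """Hypothetical root that coalesces invalidations into one traversal."""

    def __init__(self):
        super().__init__("root")
        self.traversal_scheduled = False
        self.traversals_run = 0

    def schedule_traversal(self):
        # Repeated requests before the next VSYNC collapse into one flag.
        self.traversal_scheduled = True

    def on_vsync(self, views):
        # Nothing was invalidated since the last frame: skip the traversal.
        if not self.traversal_scheduled:
            return []
        self.traversal_scheduled = False
        self.traversals_run += 1
        visited = [v.name for v in views if v.dirty]
        for v in views:
            v.dirty = False
        return visited


root = ViewRoot()
panel = View("panel", parent=root)
label = View("label", parent=panel)
button = View("button", parent=panel)
views = [root, panel, label, button]

label.invalidate()
label.invalidate()  # coalesced with the first call
first_frame = root.on_vsync(views)
second_frame = root.on_vsync(views)
```

In this sketch the two `invalidate()` calls produce a single traversal, the untouched `button` is never visited, and the second VSYNC does no work because nothing changed in between.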
DisplayList Recording and Replay
In modern Android (using HWUI), the draw() pass does not immediately push pixels. Instead, it records commands.
As the UI thread walks the View tree, it generates a DisplayList for each View. A DisplayList is a recorded sequence of drawing commands (e.g., DrawRect, DrawText, SaveLayer). Recording keeps redraws cheap: if a View hasn't changed, its previous DisplayList is reused as-is and only invalidated Views are re-recorded.
Once the entire View hierarchy has been recorded, the UI thread passes the root DisplayList to the RenderThread.
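The record-then-reuse idea can be illustrated with a small Python sketch. It is a conceptual model only: `RecordingCanvas`, `Node`, and their methods are hypothetical names, not HWUI classes, and the "commands" are plain tuples.

```python
class RecordingCanvas:
    """Collects drawing commands instead of rasterizing them."""

    def __init__(self):
        self.ops = []

    def draw_rect(self, left, top, right, bottom):
        self.ops.append(("DrawRect", left, top, right, bottom))

    def draw_text(self, text, x, y):
        self.ops.append(("DrawText", text, x, y))


class Node:
    """Hypothetical per-view holder for a cached display list."""

    def __init__(self, draw_fn):
        self.draw_fn = draw_fn
        self.display_list = None
        self.record_count = 0

    def invalidate(self):
        # Dropping the cache forces a re-record on the next frame.
        self.display_list = None

    def get_display_list(self):
        if self.display_list is None:
            canvas = RecordingCanvas()
            self.draw_fn(canvas)
            self.display_list = tuple(canvas.ops)
            self.record_count += 1
        return self.display_list


label_text = ["Hello"]
background = Node(lambda c: c.draw_rect(0, 0, 100, 40))
label = Node(lambda c: c.draw_text(label_text[0], 8, 24))

frame1 = [n.get_display_list() for n in (background, label)]

label_text[0] = "World"
label.invalidate()  # only the label changed
frame2 = [n.get_display_list() for n in (background, label)]
```

Across the two frames the background is recorded once and its cached list is handed back unchanged, while the label is recorded twice because it was invalidated.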
RenderThread: Hardware-Accelerated Drawing
The RenderThread is a dedicated background thread within the application process responsible for communicating with the GPU.
Upon receiving the DisplayList, the RenderThread:
- Dequeues an empty graphic buffer from the BufferQueue.
- Translates the high-level DisplayList commands into low-level OpenGL ES or Vulkan API calls.
- Issues these commands to the GPU, which executes shaders to rasterize the shapes and text into the buffer.
- Queues the filled buffer back into the BufferQueue, appending a timestamp indicating when it should be displayed.
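The buffer hand-off in the steps above can be modeled as a small state machine. The sketch below is a simplified Python model, not the real BufferQueue interface: the class name `ToyBufferQueue`, the method names, the slot count, and the timestamp value are all illustrative assumptions. The consumer-side `acquire_buffer` and `release_buffer` calls are included only to show how a slot returns to the free pool.

```python
from collections import deque

FREE, DEQUEUED, QUEUED, ACQUIRED = "FREE", "DEQUEUED", "QUEUED", "ACQUIRED"


class ToyBufferQueue:
    """Simplified model of buffer slots cycling between producer and consumer."""

    def __init__(self, slot_count=3):
        self.state = [FREE] * slot_count
        self.pixels = [None] * slot_count
        self.timestamps = [None] * slot_count
        self.pending = deque()  # queued slots, oldest first

    def dequeue_buffer(self):
        # Producer side: take ownership of a free slot to draw into.
        for slot, state in enumerate(self.state):
            if state == FREE:
                self.state[slot] = DEQUEUED
                return slot
        return None  # no free slot; this model simply reports failure

    def queue_buffer(self, slot, pixels, present_time_ns):
        # Producer side: hand the filled buffer over with its timestamp.
        if self.state[slot] != DEQUEUED:
            raise ValueError("slot was not dequeued by the producer")
        self.pixels[slot] = pixels
        self.timestamps[slot] = present_time_ns
        self.state[slot] = QUEUED
        self.pending.append(slot)

    def acquire_buffer(self):
        # Consumer side: take the oldest queued buffer for composition.
        if not self.pending:
            return None
        slot = self.pending.popleft()
        self.state[slot] = ACQUIRED
        return slot

    def release_buffer(self, slot):
        # Consumer side: return the slot so the producer can reuse it.
        if self.state[slot] != ACQUIRED:
            raise ValueError("slot was not acquired by the consumer")
        self.state[slot] = FREE


queue = ToyBufferQueue(slot_count=2)
slot = queue.dequeue_buffer()
queue.queue_buffer(slot, pixels="frame 1", present_time_ns=1_000_000)
acquired = queue.acquire_buffer()
states_during_composition = list(queue.state)
queue.release_buffer(acquired)
```

While the consumer holds the acquired slot, the producer can still dequeue the other slot and start the next frame, which is the reason for keeping more than one buffer in the queue.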
SurfaceFlinger: Layer Composition
Once the buffer is queued, SurfaceFlinger (the system compositor) learns that a new frame is available and does its work on the next VSYNC-SF pulse.
SurfaceFlinger wakes up and looks at the BufferQueues for all visible layers across the entire OS (the foreground app, the status bar, the navigation bar). It acquires the latest buffers from these queues. SurfaceFlinger then determines the Z-order (stacking order) and applies any global transformations, such as screen rotation or window scaling.
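A rough Python sketch of this latch-and-order step follows. It is a conceptual model under simplifying assumptions: the `Layer` fields, the `latch_and_sort` function, the buffer labels, and the Z values are hypothetical, each layer holds at most one pending buffer, and global transformations are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    name: str
    z: int                          # higher values are drawn on top
    visible: bool = True
    pending: Optional[str] = None   # buffer queued by the app, if any
    latched: Optional[str] = None   # buffer currently used for composition


def latch_and_sort(layers):
    """Pick up new buffers for visible layers and return them bottom-to-top."""
    frame = []
    for layer in layers:
        if not layer.visible:
            continue
        if layer.pending is not None:
            # A new buffer arrived: it replaces the previously latched one.
            layer.latched, layer.pending = layer.pending, None
        if layer.latched is not None:
            frame.append(layer)
    return sorted(frame, key=lambda layer: layer.z)


layers = [
    Layer("navigation bar", z=2, pending="nav#1"),
    Layer("app", z=0, pending="app#7"),
    Layer("status bar", z=1, latched="status#3"),  # no new frame this cycle
    Layer("hidden app", z=0, visible=False, pending="hidden#1"),
]
order = [(layer.name, layer.latched) for layer in latch_and_sort(layers)]
```

In this model a layer without a new buffer keeps showing its previous one, and an invisible layer is skipped without consuming its pending buffer.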
HWC: Overlay Planes vs GPU Composition
SurfaceFlinger does not immediately composite the frame itself. It prepares a list of layers and asks the Hardware Composer (HWC) HAL to evaluate them.
- Hardware Overlays (Device Composition): If the display hardware supports enough overlay planes, the HWC maps each layer directly to a hardware plane. Composition then requires no GPU work.
- GPU Composition (Client Composition): If there are too many layers or complex visual effects (like blur), the HWC hands the work back to SurfaceFlinger. SurfaceFlinger then uses the system GPU to flatten these layers into a single framebuffer.
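The decision described in the two cases above can be written as a short Python sketch. It is deliberately simplified and hypothetical: `choose_composition`, the `needs_gpu_effect` flag, and the all-or-nothing rule are assumptions for illustration. Real HWC implementations apply vendor-specific rules, decide per layer, and can mix both composition types within one frame.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    needs_gpu_effect: bool = False  # e.g. a blur the display hardware cannot apply


def choose_composition(layers, overlay_planes):
    """Return a composition type per layer using an all-or-nothing rule."""
    fits_in_planes = len(layers) <= overlay_planes
    has_effects = any(layer.needs_gpu_effect for layer in layers)
    if fits_in_planes and not has_effects:
        # Every layer gets its own hardware plane.
        return {layer.name: "DEVICE" for layer in layers}
    # Too many layers or an unsupported effect: fall back to the GPU.
    return {layer.name: "CLIENT" for layer in layers}


base = [Layer("app"), Layer("status bar"), Layer("navigation bar")]

all_overlays = choose_composition(base, overlay_planes=4)
too_many_layers = choose_composition(base, overlay_planes=2)
with_blur = choose_composition(
    base + [Layer("blurred dialog", needs_gpu_effect=True)], overlay_planes=4
)
```

With four planes the three plain layers all stay on overlays; reducing the plane count or adding a blurred layer pushes the frame to client composition in this model.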
Display Output
Finally, the completed frame is presented.
- If Device Composition was used, the display controller reads directly from the individual app buffers and the hardware blends them on the fly as it scans the pixels out to the physical screen.
- If Client Composition was used, the display controller reads the single, composited framebuffer generated by SurfaceFlinger.
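Whichever path is taken, the value of each output pixel comes from blending the layers bottom-to-top. The Python sketch below works this through for a single pixel, assuming standard source-over blending with non-premultiplied alpha and color channels in the range 0.0 to 1.0. The function names and the sample colors are illustrative, not part of any Android API.

```python
def blend_over(src_rgb, src_alpha, dst_rgb):
    """Source-over blend of one pixel: out = src * a + dst * (1 - a)."""
    return tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src_rgb, dst_rgb))


def composite_pixel(layers_bottom_to_top, background=(0.0, 0.0, 0.0)):
    """Blend a stack of (rgb, alpha) samples onto a background color."""
    out = background
    for rgb, alpha in layers_bottom_to_top:
        out = blend_over(rgb, alpha, out)
    return out


# An opaque white app pixel under a half-transparent black status bar pixel.
app_pixel = ((1.0, 1.0, 1.0), 1.0)
status_bar_pixel = ((0.0, 0.0, 0.0), 0.5)
result = composite_pixel([app_pixel, status_bar_pixel])
```

Here `result` is mid-gray, `(0.5, 0.5, 0.5)`: the opaque app pixel fully replaces the background, and the status bar then darkens it by half.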
The display illuminates, the user sees the updated UI, and the pipeline resets, waiting for the next VSYNC pulse to begin the process anew.