Understanding Application Not Responding (ANR)
In Android, responsiveness is critical. If an application blocks the main UI thread for too long, the system assumes the application is stuck and prompts the user with an "Application Not Responding" (ANR) dialog. For system apps and background services, an ANR does not always produce a dialog; the system may instead silently terminate the process or only log the event, either of which can disrupt system stability.
ANR analysis is a core competency for AOSP developers, requiring a deep understanding of thread synchronization, Binder IPC, and system server interactions.
ANR Trigger Conditions
The Android system server (specifically ActivityManagerService and InputManagerService) continuously monitors the responsiveness of applications. An ANR is triggered under specific timeout conditions:
1. Input Dispatching Timeout (5 seconds)
If the application fails to respond to an input event (such as a key press or screen touch) within 5 seconds, an ANR occurs. The input dispatcher sends the event to the app's event queue. If the app's main thread is busy doing heavy work or is deadlocked, it cannot dequeue and process the event.
2. Broadcast Receiver Timeout (10 seconds / 60 seconds)
If a BroadcastReceiver executing in the foreground does not finish executing its onReceive() method within 10 seconds, an ANR is triggered. For background broadcasts, the timeout is relaxed to 60 seconds.
3. Service Timeout (20 seconds / 200 seconds)
If a Service takes more than 20 seconds to execute its lifecycle callbacks (like onCreate(), onStartCommand(), or onBind()) in the foreground, it triggers an ANR. Background services have a generous timeout of 200 seconds.
4. ContentProvider Timeout
Less commonly, a ContentProvider can also trigger an ANR, for example when a client process waits too long to acquire the provider or when a database query against it runs too long.
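The timeout values listed above can be collected into a small lookup table, which is handy when scripting over logs. This is a minimal sketch for offline analysis, not the system's own implementation; the names ANR_TIMEOUTS_S and exceeds_anr_timeout are hypothetical, and ContentProvider is left out because no figure is given for it here.

```python
# Timeout values in seconds, copied from the trigger conditions listed above.
# ContentProvider is omitted because this section gives no figure for it.
ANR_TIMEOUTS_S = {
    ("input", "foreground"): 5,
    ("broadcast", "foreground"): 10,
    ("broadcast", "background"): 60,
    ("service", "foreground"): 20,
    ("service", "background"): 200,
}


def exceeds_anr_timeout(component: str, mode: str, elapsed_s: float) -> bool:
    """Return True if elapsed_s is past the timeout for this component and mode."""
    return elapsed_s > ANR_TIMEOUTS_S[(component, mode)]
```

For example, exceeds_anr_timeout("service", "background", 120) is False, because 120 seconds is still inside the 200-second background service window.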
ANR Trace Files
When an ANR occurs, the ActivityManagerService dumps the stack traces of the offending process, the system server, and a few other critical processes to help developers diagnose the issue.
Historically, all traces were written to a single file, /data/anr/traces.txt. In modern Android versions, each ANR gets its own file within the /data/anr/ directory, typically named anr_<timestamp>.
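You can extract these traces using adb (on production user builds /data/anr is normally not readable by the shell user, so this usually requires a userdebug/eng build or root):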
adb pull /data/anr/ .
Or use the bugreport tool, which packages all ANR traces along with system logs:
adb bugreport my_bugreport.zip
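Once the bugreport archive is on the host, the ANR traces can be located inside it with a few lines of Python. This sketch assumes the traces sit under an FS/data/anr/ prefix inside the zip; that prefix is an assumption and may differ between releases. The function name list_anr_entries is hypothetical.

```python
import zipfile


def list_anr_entries(bugreport_zip: str, prefix: str = "FS/data/anr/") -> list[str]:
    """Return the sorted names of archive members under the assumed ANR directory."""
    with zipfile.ZipFile(bugreport_zip) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # Skip directory entries, which end with a slash.
            if name.startswith(prefix) and not name.endswith("/")
        )
```

Calling list_anr_entries("my_bugreport.zip") returns the member names, which can then be read with zipfile.ZipFile.read.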
Reading Thread Dumps in ANR Traces
The core of ANR analysis involves reading the Dalvik/ART thread dumps. An ANR trace provides a snapshot of every thread in the application at the moment the system declared it unresponsive.
A typical thread dump snippet looks like this:
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x72a9b238 self=0xb400007b8a74e310
| sysTid=12345 nice=-10 cgrp=default sched=0/0 handle=0x7dbf1cb4f8
| state=S schedstat=( 123456789 987654321 42 ) utm=12 stm=3 core=0 HZ=100
| stack=0x7fc1a00000-0x7fc1a02000 stackSize=8MB
| held mutexes=
at com.example.app.DatabaseHelper.queryData(DatabaseHelper.java:42)
- waiting to lock <0x01234567> (a java.lang.Object) held by thread 14
at com.example.app.MainActivity.onResume(MainActivity.java:55)
at android.app.Activity.performResume(Activity.java:8305)
at android.app.ActivityThread.performResumeActivity(ActivityThread.java:4764)
Key attributes to inspect:
"main": The name of the thread. Always start your analysis with the main thread.
prio: Thread priority.
tid: ART internal thread ID.
sysTid: Linux thread ID (matches pid for the main thread).
State: Runnable (executing code), Sleeping (e.g., Thread.sleep), Waiting (waiting on a monitor/condition), or Blocked (waiting to acquire a lock).
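The attributes above can be pulled out of a dump mechanically. The following is a minimal sketch that assumes the header layout shown in the sample (quoted name, optional daemon marker, prio, tid, then state); the names HEADER_RE, SYSTID_RE, and parse_thread are hypothetical, and real traces may contain variations this does not handle.

```python
import re

# Matches a header such as: "main" prio=5 tid=1 Blocked
HEADER_RE = re.compile(
    r'^"(?P<name>[^"]+)"(?P<daemon> daemon)? '
    r"prio=(?P<prio>\d+) tid=(?P<tid>\d+) (?P<state>\w+)"
)
SYSTID_RE = re.compile(r"sysTid=(\d+)")


def parse_thread(block: str) -> dict:
    """Extract name, prio, tid, state, and sysTid from one thread's dump text."""
    lines = block.strip().splitlines()
    header = HEADER_RE.match(lines[0].strip())
    if header is None:
        raise ValueError("not a thread header: " + lines[0])
    info = {
        "name": header["name"],
        "prio": int(header["prio"]),
        "tid": int(header["tid"]),
        "state": header["state"],
        "sysTid": None,  # filled in below if the attribute line is present
    }
    systid = SYSTID_RE.search(block)
    if systid:
        info["sysTid"] = int(systid.group(1))
    return info
```

Applied to the sample dump, this yields the name main, tid 1, state Blocked, and sysTid 12345.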
Main Thread Stuck Analysis
To find the root cause, locate the main thread and observe its state.
Scenario A: Main Thread is Blocked
If the main thread is Blocked, it means it is trying to enter a synchronized block but another thread already holds the lock.
Look at the stack trace:
- waiting to lock <0x01234567> (a java.lang.Object) held by thread 14
Next, search the trace file for tid=14 (or whatever thread ID is listed) and analyze what that thread is doing. If thread 14 is performing a long-running network request while holding the lock, you have found your root cause: lock contention delaying the main thread.
Scenario B: Main Thread is Runnable (Doing Heavy Work)
If the main thread is Runnable, it was actively executing code when the ANR occurred.
Look at the top of the stack. If it is deep inside JSON parsing, bitmap resizing, or an accidental infinite loop, the fix is to move that work to a background thread (e.g., using Kotlin Coroutines, RxJava, or ExecutorService).
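When the main thread is Runnable, the most useful frame is usually the topmost one that belongs to the application rather than the framework. The sketch below picks that frame out of a stack; the FRAMEWORK_PREFIXES list is an assumption to adjust per project, and first_app_frame is a hypothetical helper name.

```python
import re

# Matches lines such as: at com.example.app.MainActivity.onResume(MainActivity.java:55)
FRAME_RE = re.compile(r"^\s*at ([\w.$<>]+)\(")

# Assumed framework/runtime package prefixes to skip; extend as needed.
FRAMEWORK_PREFIXES = (
    "android.", "androidx.", "com.android.", "java.", "javax.",
    "kotlin.", "dalvik.", "libcore.", "sun.",
)


def first_app_frame(stack: str):
    """Return the topmost method that is not in a framework package, or None."""
    for line in stack.splitlines():
        frame = FRAME_RE.match(line)
        if frame and not frame.group(1).startswith(FRAMEWORK_PREFIXES):
            return frame.group(1)
    return None
```

If the result points into parsing, image processing, or a loop in application code, that method is the first candidate for moving off the main thread.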
Binder Deadlock Detection in ANR
A common and complex source of ANRs in AOSP is the Binder deadlock. This occurs when Process A makes a synchronous Binder call to Process B while Process B is itself blocked on a synchronous Binder call to Process A, so neither call can complete.
When this happens, the ANR trace will show threads in a Native state, often blocked in native Binder communication methods like IPCThreadState::waitForResponse.
"main" prio=5 tid=1 Native
| state=S
at android.os.BinderProxy.transactNative(Native method)
at android.os.BinderProxy.transact(BinderProxy.java:540)
at android.app.IActivityManager$Stub$Proxy.getTasks(IActivityManager.java:4523)
To resolve this, you must correlate traces across multiple processes. If App A's main thread is waiting on SystemServer, you must open the SystemServer trace and find the Binder thread processing App A's request. If that SystemServer thread is waiting on a lock held by another thread, follow the chain until you find the true bottleneck.
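Following the chain by hand gets error-prone once several processes are involved. One way to organize the work is to record each "X waits on Y" relationship you find in the traces and then walk the resulting graph. This is a minimal sketch of that idea; the graph is filled in manually from the traces, and find_wait_cycle and the node labels are hypothetical.

```python
def find_wait_cycle(waits_on: dict, start: str):
    """Follow 'X waits on Y' edges from start; return the cycle, or None."""
    path = []
    position = {}  # node -> index in path, used to cut the cycle out
    node = start
    while node is not None and node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_on.get(node)
    if node is None:
        # The chain ended at a party that is not waiting on anyone:
        # that is a bottleneck, not a deadlock.
        return None
    return path[position[node]:] + [node]
```

For the two-process case described above, the edges {"app:main": "system_server:binder", "system_server:binder": "app:main"} return a cycle that starts and ends at app:main, which confirms a deadlock rather than a slow call.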
ANR in System Processes
When an ANR happens in a core system process (like system_server), the stakes are much higher. The Watchdog service continuously monitors critical locks within system_server (such as the ActivityManager lock and WindowManager lock).
If a thread holds one of these critical locks for longer than 1 minute, the Watchdog intentionally triggers a fatal crash to restart system_server (causing a soft reboot) rather than leaving the device in a perpetually unresponsive state.
These instances will generate a tombstone and an ANR trace. Analysis requires tracing exactly which thread acquired the lock (often by inspecting the held mutexes= line or looking for - locked <0x...> entries in the traces) and why it failed to release the lock promptly.
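The idea behind this kind of lock monitoring can be illustrated with a loose analogy in Python: a checker tries to take a lock within a deadline and reports failure if it cannot. This is only a sketch of the concept, not how the Watchdog is implemented; lock_is_responsive is a hypothetical name, and the timeout would be 60 seconds to mirror the 1-minute limit described above.

```python
import threading


def lock_is_responsive(lock: threading.Lock, timeout_s: float) -> bool:
    """Try to take the lock within timeout_s, releasing it at once if acquired."""
    if lock.acquire(timeout=timeout_s):
        lock.release()
        return True
    # Some other holder kept the lock for the whole window.
    return False
```

A monitor built on this would call lock_is_responsive periodically for each critical lock and escalate, for example by dumping stacks, when it returns False.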