MVP Factory
ai startup development

Diagnosing Android Jank with FrameTimeline API: SurfaceFlinger Deadlines, HWUI Thread Contention, and the Systrace Workflow That Pinpoints Exact Recomposition Frames Exceeding the 16ms Budget

Krystian Wiewiór · 5 min read

TL;DR

Android 12’s FrameTimeline API exposes expected-vs-actual frame deadlines directly in Perfetto traces, so you stop guessing about jank. Correlate SurfaceFlinger deadline misses with HWUI render thread contention and specific Compose recomposition frames to isolate the exact layout or composition pass breaking the 16.6ms budget. Then automate regression detection in CI with Macrobenchmark’s FrameTimingMetric.


The 16ms problem most teams get wrong

After years of building production Android systems, I’m convinced the number one mistake teams make with jank diagnosis is assuming it’s a GPU problem. In practice, most frame drops in Compose-based UIs originate on the main thread or during recomposition, not during draw. The hard part has always been figuring out which frame, which pass, and why.

Before Android 12, you could see dropped frames in systrace, but connecting a specific Choreographer callback to a SurfaceFlinger deadline miss meant manually aligning timelines. Tedious. FrameTimeline fixed that.

How FrameTimeline connects the pipeline

FrameTimeline assigns each frame a unique token that flows through the entire rendering pipeline:

Choreographer.doFrame() → Compose recomposition → HWUI RenderThread → SurfaceFlinger

For every frame, SurfaceFlinger now records:

Field | Description
expectedPresentTime | The VSYNC deadline the frame was targeting
actualPresentTime | When the frame actually hit the display
jankType | Classification: None, AppDeadlineMissed, SurfaceFlinger, PredictionError
frameDuration | Total time from start to present

Simple: when actualPresentTime exceeds expectedPresentTime, you missed the deadline. jankType tells you whether your app or the compositor caused it.
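To make the comparison concrete, here is a minimal Kotlin sketch of that rule. The `FrameRecord` class, the `JankType` enum, and the sample timestamps are hypothetical stand-ins that mirror the fields listed above; they are not Android or Perfetto types.

```kotlin
// Hypothetical model of one FrameTimeline record. The field names mirror the
// table above; this is not an actual Android or Perfetto class.
enum class JankType { None, AppDeadlineMissed, SurfaceFlinger, PredictionError }

data class FrameRecord(
    val token: Long,
    val expectedPresentTimeNs: Long,
    val actualPresentTimeNs: Long,
    val jankType: JankType,
)

// Positive overrun means the frame was presented after its VSYNC deadline.
fun FrameRecord.overrunMs(): Double =
    (actualPresentTimeNs - expectedPresentTimeNs) / 1_000_000.0

fun FrameRecord.missedDeadline(): Boolean =
    actualPresentTimeNs > expectedPresentTimeNs

// Keep only the deadline misses that are attributed to the app.
fun appJank(frames: List<FrameRecord>): List<FrameRecord> =
    frames.filter { it.missedDeadline() && it.jankType == JankType.AppDeadlineMissed }

fun main() {
    // Made-up timestamps, one VSYNC period (16.6 ms) apart.
    val frames = listOf(
        FrameRecord(101, 16_600_000, 16_600_000, JankType.None),
        FrameRecord(102, 33_200_000, 49_800_000, JankType.AppDeadlineMissed),
        FrameRecord(103, 49_800_000, 66_400_000, JankType.SurfaceFlinger),
    )
    for (f in appJank(frames)) {
        println("token=${f.token} overrunMs=${f.overrunMs()}")
    }
}
```

Frame 103 also presents late, but the filter drops it because the compositor, not the app, is blamed. That split decides where you look next.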

The systrace workflow that pinpoints the frame

This is the workflow I use in production.

Start by capturing a Perfetto trace with FrameTimeline enabled:

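# --txt marks the stdin config as text proto; with -c, the duration must live in the config.
adb shell perfetto --txt -c - -o /data/misc/perfetto-traces/trace.pb <<EOF
duration_ms: 10000
buffers: { size_kb: 65536 }
data_sources: { config { name: "android.surfaceflinger.frametimeline" } }
data_sources: { config { name: "android.gpu.memory" } }
# ftrace/atrace adds the Choreographer#doFrame, RenderThread, and app-defined slices.
data_sources: { config { name: "linux.ftrace" ftrace_config {
  atrace_categories: "gfx" atrace_categories: "view"
  atrace_apps: "com.example.app"
} } }
EOF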

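Pull the file with adb pull /data/misc/perfetto-traces/trace.pb, open it in the Perfetto UI, and look for the Expected Timeline and Actual Timeline lanes. They render one above the other for each surface. Red slices in the Actual Timeline are frames where the process that owns the track missed its deadline; yellow slices are frames that janked even though the app delivered on time.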

Click a red frame slice. The detail panel shows the frame token, jank classification, and duration breakdown. If jankType = AppDeadlineMissed, the bottleneck is in your app process.

Now cross-reference the frame token with Choreographer slices. In the app process tracks, find the Choreographer#doFrame slice whose name carries the same VSYNC ID as the frame token. Expand its children and you’ll see the recomposition (Recomposer:recompose in recent Compose versions), measure, layout, and draw phases with individual durations.

This is usually where you get your answer. A recomposition triggered by a derivedStateOf recalculation or an unbounded LazyColumn item composition shows up as an abnormally long child slice.
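If you export slices from the trace, the same lookup can be scripted. The sketch below is one way to do it in Kotlin. The `Slice` class is a hypothetical export shape, the phase names are illustrative, and the code assumes the doFrame slice name ends with the VSYNC ID (for example "Choreographer#doFrame 48213").

```kotlin
// Hypothetical shape of a slice exported from a trace; not a Perfetto API type.
data class Slice(
    val name: String,
    val durMs: Double,
    val children: List<Slice> = emptyList(),
)

// Assumption: the slice name carries the VSYNC ID as a trailing number.
private val DO_FRAME = Regex("^Choreographer#doFrame (\\d+)$")

fun vsyncIdOf(slice: Slice): Long? =
    DO_FRAME.find(slice.name)?.groupValues?.get(1)?.toLongOrNull()

// Find the doFrame slice for a janky frame token and return its slowest child phase.
fun slowestPhase(mainThreadSlices: List<Slice>, frameToken: Long): Slice? =
    mainThreadSlices
        .firstOrNull { vsyncIdOf(it) == frameToken }
        ?.children
        ?.maxByOrNull { it.durMs }

fun main() {
    // Made-up durations for two consecutive frames.
    val slices = listOf(
        Slice("Choreographer#doFrame 48212", 6.1, listOf(Slice("measure", 1.0), Slice("draw", 2.0))),
        Slice(
            "Choreographer#doFrame 48213", 24.3,
            listOf(
                Slice("Recomposer:recompose", 18.9),
                Slice("measure", 2.2),
                Slice("draw", 1.7),
            ),
        ),
    )
    val culprit = slowestPhase(slices, frameToken = 48213)
    println("${culprit?.name} took ${culprit?.durMs} ms")
}
```

Returning null when no slice matches is deliberate: a missing match usually means the trace was captured without the app's trace categories, which is worth surfacing rather than hiding.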

HWUI render thread contention

Even when the main thread finishes under budget, the HWUI RenderThread can miss the SurfaceFlinger deadline on its own. Common culprits:

Cause | Signature in trace | Typical impact
Large display list sync | Long syncFrameState slice | 2-8ms added
GPU texture upload | Upload slices on RenderThread | 3-12ms for large bitmaps
Shader compilation | compile slices | 20-80ms first occurrence
Contention with main thread | Lock contention markers | 1-5ms stalls

Shader compilation jank during first-launch animations is a particular headache in Compose. The RenderThread trace will show Pipeline::run with a child ShaderCompile eating well over a full frame budget. I’ve seen 60ms+ hits from a single shader compile on mid-range devices.

Automating regression detection in CI

Manual tracing doesn’t scale. And the most common mistake teams make with performance CI is measuring averages. For jank, you need percentiles.
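A quick way to see why averages mislead is to compute both on the same data. This Kotlin sketch uses made-up frame durations and a nearest-rank percentile. That percentile method is a choice made for this illustration, not necessarily the one Macrobenchmark uses internally.

```kotlin
import kotlin.math.ceil

// Nearest-rank percentile: the smallest sample with at least p percent
// of all samples at or below it.
fun percentile(samples: List<Double>, p: Double): Double {
    require(samples.isNotEmpty()) { "need at least one sample" }
    require(p > 0.0 && p <= 100.0) { "p must be in (0, 100]" }
    val sorted = samples.sorted()
    val rank = ceil(p / 100.0 * sorted.size).toInt()
    return sorted[rank - 1]
}

fun main() {
    // 98 smooth frames at 8 ms plus two 60 ms hitches (made-up numbers).
    val frameMs = List(98) { 8.0 } + listOf(60.0, 60.0)
    println("average = ${frameMs.average()} ms")
    println("P50 = ${percentile(frameMs, 50.0)} ms")
    println("P99 = ${percentile(frameMs, 99.0)} ms")
}
```

The average stays near 9 ms, comfortably inside the 16.6ms budget, while P99 lands on the 60 ms hitch that users actually notice. Macrobenchmark computes the percentiles for you on a real device: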

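import androidx.benchmark.macro.FrameTimingMetric
import androidx.benchmark.macro.junit4.MacrobenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.uiautomator.By
import androidx.test.uiautomator.Direction
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class ScrollJankBenchmark {
    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun scrollJankMetrics() {
        benchmarkRule.measureRepeated(
            packageName = "com.example.app",
            metrics = listOf(FrameTimingMetric()),
            iterations = 5,
            // Start from a freshly launched activity before every measured iteration.
            setupBlock = { pressHome(); startActivityAndWait() }
        ) {
            // Assumes the list is exposed under the resource id "item_list".
            val list = device.findObject(By.res("item_list"))
            // Keep the fling away from the screen edges so it is not read as a system gesture.
            list.setGestureMargin(device.displayWidth / 5)
            list.fling(Direction.DOWN)
        }
    }
}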

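FrameTimingMetric reports frameDurationCpuMs and frameOverrunMs at P50, P90, P95, and P99. The metric you care about most is frameOverrunMs at P99, which measures how far past the deadline your worst frames land. Negative values mean the frame finished ahead of its deadline, and the metric is only reported on Android 12 (API 31) and later. Set your CI gates around these thresholds: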

Metric | Green | Yellow | Red
frameOverrunMs P50 | < 0ms | 0-5ms | > 5ms
frameOverrunMs P99 | < 8ms | 8-16ms | > 16ms
frameDurationCpuMs P90 | < 12ms | 12-16ms | > 16ms

Store results in a time-series database and alert on week-over-week P99 regressions exceeding 3ms. This catches composition-layer regressions before they ship.
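The gate itself can be a few lines of Kotlin run after the benchmark. This is a sketch under assumptions: the thresholds are copied from the table above, while the `Gate`, `Threshold`, `gate`, and `regressed` names and the sample numbers are hypothetical. How you read the benchmark output and last week's baseline is left out.

```kotlin
enum class Gate { GREEN, YELLOW, RED }

// Values below yellowFrom are green, values above redAbove are red,
// everything in between is yellow.
data class Threshold(val yellowFrom: Double, val redAbove: Double)

// Thresholds copied from the table above.
val thresholds = mapOf(
    "frameOverrunMs P50" to Threshold(0.0, 5.0),
    "frameOverrunMs P99" to Threshold(8.0, 16.0),
    "frameDurationCpuMs P90" to Threshold(12.0, 16.0),
)

fun gate(metric: String, value: Double): Gate {
    val t = thresholds.getValue(metric)
    return when {
        value > t.redAbove -> Gate.RED
        value >= t.yellowFrom -> Gate.YELLOW
        else -> Gate.GREEN
    }
}

// Week-over-week check: flag a P99 that grew by more than limitMs.
fun regressed(lastWeekP99: Double, thisWeekP99: Double, limitMs: Double = 3.0): Boolean =
    thisWeekP99 - lastWeekP99 > limitMs

fun main() {
    // Made-up results from one benchmark run.
    val run = mapOf(
        "frameOverrunMs P50" to -4.2,
        "frameOverrunMs P99" to 11.5,
        "frameDurationCpuMs P90" to 9.8,
    )
    val results = run.mapValues { (metric, value) -> gate(metric, value) }
    results.forEach { (metric, g) -> println("$metric -> $g") }

    val fail = results.values.any { it == Gate.RED } ||
        regressed(lastWeekP99 = 7.9, thisWeekP99 = 11.5)
    println(if (fail) "FAIL" else "PASS")
}
```

In this sample no metric is red, yet the build still fails because P99 overrun grew by 3.6 ms week over week. Keeping the absolute gate and the trend check separate catches slow drift that never crosses a hard threshold.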

What to do with all this

Use FrameTimeline tokens, not averages. Open Perfetto, find the red Actual Timeline slices, trace the frame token back to the exact Choreographer callback. Hours of guessing become a five-minute lookup.

Profile the RenderThread separately. A clean main thread doesn’t guarantee smooth frames. Shader compilation, texture uploads, and display list sync are all independent sources of deadline misses, and they bite hardest on first launch when you least want stutters.

Gate CI on frameOverrunMs P99, not average frame duration. Averages hide the worst frames, and the worst frames are what users actually feel. Macrobenchmark’s FrameTimingMetric gives you real percentile data. Set hard thresholds and fail the build when tail latency regresses.

