Fixing Android jank you can't see with Systrace
Meta description: Learn how to use Perfetto and Systrace to find hidden jank in Jetpack Compose apps — RenderThread stalls, recomposition traps, and frame timing fixes.
Tags: android, jetpackcompose, kotlin, architecture, mobile
TL;DR
Most Compose jank never shows up in Android Studio’s basic profiler. The real culprits live in Perfetto traces: RenderThread GPU fence stalls, unstable lambda recompositions, and Choreographer timing gaps. I walk through the patterns that keep 99th-percentile frame times under 16ms, including stability annotations, remember strategies, lazy layout prefetch tuning, and display list batching.
The jank you’re not measuring
Most teams get performance wrong in the same way: they profile the main thread, see green frames, and ship. Meanwhile, users feel the stutter.
Dashboards built on average frame time hide the problem, because perceived smoothness depends on the 99th percentile. A single 34ms frame in a 300-frame scroll ruins the feel. Users don’t care that 299 frames were fast. These outliers hide in places that Android Studio’s layout inspector won’t show you.
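To make the average-versus-tail point concrete, here is a minimal plain-Kotlin sketch. The frame durations are made up for illustration, and `percentile` is a hypothetical helper using the nearest-rank method, not an Android API. One caveat: with a single slow frame in 300, nearest-rank P99 still reads as a fast frame, so the sample uses four slow frames and prints the maximum too.

```kotlin
// Nearest-rank percentile over frame durations in milliseconds.
// Integer arithmetic is used for the rank to avoid floating-point rounding.
fun percentile(frameTimesMs: List<Double>, p: Int): Double {
    require(frameTimesMs.isNotEmpty()) { "need at least one frame" }
    require(p in 1..100) { "p must be in 1..100" }
    val sorted = frameTimesMs.sorted()
    val rank = (p * sorted.size + 99) / 100 // ceil(p * n / 100)
    return sorted[rank - 1]
}

fun main() {
    // Hypothetical scroll: 296 frames at 8 ms plus 4 frames at 34 ms.
    val frames = List(296) { 8.0 } + List(4) { 34.0 }
    println("average = ${frames.average()} ms")
    println("p99     = ${percentile(frames, 99)} ms")
    println("max     = ${frames.maxOrNull()} ms")
}
```

The average stays close to 8 ms while P99 lands on the slow frames, which is why a dashboard that only tracks the mean can look healthy during a stuttering scroll.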
Perfetto’s frame lifecycle, explained
Perfetto (the successor to Systrace) exposes the full frame lifecycle. This is the pipeline that matters:
| Stage | Thread | What happens | Jank risk |
|---|---|---|---|
| doFrame | Main | Input, animation, measure/layout/draw | Recomposition storms |
| syncFrameState | Main → Render | Display list sync to RenderThread | Large draw ops |
| issueDrawCommands | RenderThread | GPU command submission | Shader compilation |
| GPU Completion Fence | RenderThread | Wait for GPU to finish | Texture uploads, overdraw |
|  | SurfaceFlinger | Buffer swap to display | Missed vsync deadline |
The thing that trips people up: your main thread can finish in 4ms and you still drop the frame if the RenderThread’s GPU completion fence stalls past the vsync boundary. Perfetto shows this as a gap between the end of DrawFrame on the RenderThread track and the start of the next doFrame on the main thread.
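A rough way to reason about that failure mode is to add up the stages for one frame and compare the total against the refresh budget. The sketch below is a deliberately simplified, serial model (real frames are pipelined across threads), and `FrameStages`, `missesDeadline`, and all the durations are hypothetical names and numbers chosen for illustration.

```kotlin
// Simplified, serial view of one frame. Real frames overlap across threads,
// so treat this as back-of-the-envelope arithmetic, not a scheduler model.
data class FrameStages(
    val mainThreadMs: Double,   // doFrame: input, animation, measure/layout/draw
    val syncMs: Double,         // syncFrameState
    val issueDrawMs: Double,    // issueDrawCommands
    val gpuFenceWaitMs: Double, // waiting on the GPU completion fence
) {
    val totalMs: Double
        get() = mainThreadMs + syncMs + issueDrawMs + gpuFenceWaitMs
}

// True when the summed stage time exceeds one refresh interval.
fun missesDeadline(frame: FrameStages, refreshRateHz: Int = 60): Boolean {
    val budgetMs = 1000.0 / refreshRateHz
    return frame.totalMs > budgetMs
}

fun main() {
    // Main thread is fast (4 ms), but the fence wait alone is 12 ms.
    val stalled = FrameStages(mainThreadMs = 4.0, syncMs = 1.0, issueDrawMs = 2.0, gpuFenceWaitMs = 12.0)
    println("total = ${stalled.totalMs} ms, missed = ${missesDeadline(stalled)}")
}
```

In this toy case the main thread uses a quarter of the budget and the frame still misses, which is exactly the situation a main-thread-only profile cannot reveal.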
RenderThread stalls
RenderThread operates on RenderNode display lists. When Compose generates too many draw operations through deep modifier chains, unbounded Canvas draws, or large image layers, the display list balloons and GPU completion takes longer.
The pattern I’ve seen cause the most trouble in production is first-frame shader compilation. The RenderThread blocks on GrGLGpu::compile while the GPU compiles a shader variant it hasn’t seen before. This shows up as a single 40-80ms spike that is completely invisible in CPU profiling. It happens once, it looks like noise, and teams ignore it. Don’t.
Pre-warm shader paths during splash screen transitions, and keep your RenderNode tree flat. Every nested graphicsLayer creates a new RenderNode with its own display list.
Compose recomposition traps
Compose’s recomposition model is efficient right up until you hand it an unstable lambda or an unmarked data class. Then it falls apart fast. Two patterns dominate:
Unstable lambda captures
// BAD: New lambda instance every recomposition
@Composable
fun ItemRow(item: Item, viewModel: MyViewModel) {
    Button(onClick = { viewModel.onItemClick(item.id) }) {
        Text(item.name)
    }
}
// GOOD: Stable reference via remember
// Key on everything the lambda captures, so it is rebuilt if either value changes.
@Composable
fun ItemRow(item: Item, viewModel: MyViewModel) {
    val onClick = remember(item.id, viewModel) { { viewModel.onItemClick(item.id) } }
    Button(onClick = onClick) {
        Text(item.name)
    }
}
Every unstable lambda makes the Composer mark that scope as changed, which triggers recomposition of the Button and all its children. In a LazyColumn with 50 visible items, that’s 50 unnecessary recompositions per frame during scroll. You can watch it happen in real time with the composition count overlay, and it’s painful.
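The underlying issue is object identity, and it can be shown without Compose at all. The sketch below is a plain-Kotlin analogy: `KeyedSlot` is a hypothetical stand-in for `remember(key)`, not how Compose implements it (Compose stores remembered values in the composition itself), and `ClickTarget` is a made-up substitute for a view model.

```kotlin
class ClickTarget {
    val clicked = mutableListOf<String>()
    fun onItemClick(id: String) { clicked += id }
}

// Builds a fresh capturing lambda on every call, like an inline `onClick = { ... }`.
fun newHandler(target: ClickTarget, id: String): () -> Unit = { target.onItemClick(id) }

// Hypothetical stand-in for remember(key): reuses the stored value while the key is unchanged.
class KeyedSlot<K, V> {
    private var entry: Pair<K, V>? = null

    fun getOrCompute(key: K, compute: () -> V): V {
        val cached = entry
        if (cached != null && cached.first == key) return cached.second
        val fresh = compute()
        entry = key to fresh
        return fresh
    }
}

fun main() {
    val target = ClickTarget()

    // Two handlers built from the same inputs are still separate objects.
    val first = newHandler(target, "42")
    val second = newHandler(target, "42")
    println("fresh lambdas identical:  ${first === second}")

    // With a keyed slot, the second request returns the stored instance.
    val slot = KeyedSlot<String, () -> Unit>()
    val a = slot.getOrCompute("42") { newHandler(target, "42") }
    val b = slot.getOrCompute("42") { newHandler(target, "42") }
    println("cached lambdas identical: ${a === b}")
}
```

When a parameter cannot be proven stable, a new instance reads as a changed input, so keeping the same instance across recompositions is what lets the child be skipped.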
Missing stability annotations
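The Compose compiler can skip a composable only when it can prove that all of its parameters are stable and their values are unchanged. Classes from external modules that were not built with the Compose compiler are assumed unstable by default.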
// Mark data classes as stable when you control mutation
@Immutable
data class UiItem(val id: String, val title: String, val imageUrl: String)
Use the Compose Compiler metrics report (-P plugin:androidx.compose.compiler.plugins.kotlin:metricsDestination=...) to audit which classes the compiler treats as unstable. I’ve seen scroll jank drop by 60% after stabilizing just three key model classes. Three classes. That’s it.
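If your project uses the Compose Compiler Gradle plugin (an assumption here; it ships with Kotlin 2.0 and later), the same reports can be enabled from the build script instead of raw compiler flags. This is a minimal sketch for a module-level build.gradle.kts, and the `compose_compiler` output directory name is an arbitrary choice.

```kotlin
// build.gradle.kts of the module that contains the composables.
// Assumes the org.jetbrains.kotlin.plugin.compose Gradle plugin is applied.
composeCompiler {
    // Per-class and per-composable stability reports.
    reportsDestination = layout.buildDirectory.dir("compose_compiler")
    // Aggregate module-level metrics.
    metricsDestination = layout.buildDirectory.dir("compose_compiler")
}
```

After a build, the classes report in that directory lists each class with the stability the compiler inferred, which is the quickest way to find the handful of models worth annotating.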
Lazy layout prefetch tuning
LazyColumn prefetches items ahead of the scroll direction. The default prefetch distance works for simple items, but complex compositions blow past the prefetch frame budget.
| Prefetch strategy | Frame budget used | Scroll smoothness |
|---|---|---|
| Default (next item) | ~4-6ms per prefetch | Good for simple items |
| Custom LazyLayoutPrefetchStrategy(3) | ~2ms x 3 items spread across frames | Better for complex items |
| Over-prefetch (10+) | Steals budget from visible frames | Worse, causes visible jank |
The sweet spot is prefetching 2-3 items with compositions that complete within 3ms each. Measure this in Perfetto by filtering to compose:recompose slices during scroll.
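The budget arithmetic behind that recommendation can be sketched in a few lines of plain Kotlin. Everything here is hypothetical: `prefetchesPerFrame` is an illustrative helper, and the 9 ms of visible work is a made-up figure you would replace with numbers read from your own trace.

```kotlin
// How many prefetch compositions fit into the idle part of one frame.
fun prefetchesPerFrame(frameBudgetMs: Double, visibleWorkMs: Double, prefetchCostMs: Double): Int {
    require(prefetchCostMs > 0.0) { "prefetch cost must be positive" }
    // Time left after the visible content has been produced; never negative.
    val idleMs = (frameBudgetMs - visibleWorkMs).coerceAtLeast(0.0)
    return (idleMs / prefetchCostMs).toInt()
}

fun main() {
    val budgetMs = 1000.0 / 60 // one frame at 60 Hz
    // Hypothetical numbers: 9 ms of visible work per frame.
    println("3 ms items: ${prefetchesPerFrame(budgetMs, visibleWorkMs = 9.0, prefetchCostMs = 3.0)} per frame")
    println("5 ms items: ${prefetchesPerFrame(budgetMs, visibleWorkMs = 9.0, prefetchCostMs = 5.0)} per frame")
}
```

Under these assumptions, cheap items leave room for a couple of prefetches per frame, while heavier ones quickly eat the idle time, which matches the over-prefetch row in the table.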
Display list batching
Every Modifier.drawBehind or Canvas block generates draw operations in the RenderNode display list. If you have related draws, batch them into a single modifier:
// BAD: Three separate draw passes
Modifier
    .drawBehind { drawRect(backgroundColor) }
    .drawBehind { drawRoundRect(borderColor, cornerRadius = CornerRadius(8.dp.toPx())) }
    .drawBehind { drawCircle(accentColor, radius = 4.dp.toPx()) }
// GOOD: Single draw pass
// cornerRadius takes a CornerRadius (androidx.compose.ui.geometry), not a raw Float.
Modifier.drawBehind {
    drawRect(backgroundColor)
    drawRoundRect(borderColor, cornerRadius = CornerRadius(8.dp.toPx()))
    drawCircle(accentColor, radius = 4.dp.toPx())
}
This reduces RenderNode operations and keeps display list serialization under budget.
What to actually do
Profile with Perfetto, not just Android Studio. Filter on the RenderThread track and look for GPU completion fence stalls exceeding 8ms. That’s where your invisible jank lives.
Run the Compose Compiler stability report on every release. Annotate key UI model classes with @Immutable or @Stable, wrap lambdas in remember with proper keys, and audit recomposition counts with the layout inspector’s composition count overlay.
Tune LazyColumn prefetch to 2-3 items and flatten your RenderNode tree. Batch draw operations into single modifiers. Minimize nested graphicsLayer calls. And stop looking at average frame time. Your users feel the 99th percentile.