
Fixing Android jank you can't see with Systrace

Krystian Wiewiór · 5 min read

Meta description: Learn how to use Perfetto and Systrace to find hidden jank in Jetpack Compose apps — RenderThread stalls, recomposition traps, and frame timing fixes.

Tags: android, jetpackcompose, kotlin, architecture, mobile


TL;DR

Most Compose jank never shows up in Android Studio’s basic profiler. The real culprits live in Perfetto traces: RenderThread GPU fence stalls, unstable lambda recompositions, and Choreographer timing gaps. I walk through the patterns that keep 99th-percentile frame times under the 16.7ms budget of a 60Hz display, including stability annotations, remember strategies, lazy layout prefetch tuning, and display list batching.


The jank you’re not measuring

Most teams get performance wrong in the same way: they profile the main thread, see green frames, and ship. Meanwhile, users feel the stutter.

Standard frame metrics report averages, but perceived smoothness depends on the slowest frames, the 99th percentile and beyond. A single 34ms frame in a 300-frame scroll, which spans more than two 16.7ms vsync intervals at 60Hz, ruins the feel. Users don’t care that 299 frames were fast. These outliers hide in places that Android Studio’s Layout Inspector won’t show you.
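To make the gap between the average and the tail concrete, here is a minimal sketch in plain Kotlin. The frame durations are made-up sample data (296 frames at 8ms and four at 34ms), and `percentile` uses the simple nearest-rank method; both are assumptions for illustration, not output from a real trace.

```kotlin
import kotlin.math.ceil

/** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
fun percentile(samples: List<Double>, p: Double): Double {
    require(samples.isNotEmpty()) { "need at least one sample" }
    require(p > 0.0 && p <= 100.0) { "p must be in (0, 100]" }
    val sorted = samples.sorted()
    val rank = ceil(p / 100.0 * sorted.size).toInt() // 1-based rank
    return sorted[rank - 1]
}

/** Counts frames that exceeded the per-frame budget (one 60Hz vsync period by default). */
fun jankyFrames(samples: List<Double>, budgetMs: Double = 1000.0 / 60.0): Int =
    samples.count { it > budgetMs }

fun main() {
    // Hypothetical scroll: 296 fast frames and 4 slow ones, in milliseconds.
    val frames = List(296) { 8.0 } + List(4) { 34.0 }

    println("mean  = ${frames.average()} ms")
    println("p99   = ${percentile(frames, 99.0)} ms")
    println("janky = ${jankyFrames(frames)} frames")
}
```

With this sample the mean stays below 9ms while the 99th percentile lands on a 34ms frame. With only one slow frame in 300, nearest-rank p99 would not catch it either, so it is worth tracking the count of over-budget frames alongside the percentile.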

Perfetto’s frame lifecycle, explained

Perfetto (the successor to Systrace) exposes the full frame lifecycle. This is the pipeline that matters:

| Stage | Thread | What happens | Jank risk |
| --- | --- | --- | --- |
| doFrame | Main | Input, animation, measure/layout/draw | Recomposition storms |
| syncFrameState | Main → Render | Display list sync to RenderThread | Large draw ops |
| issueDrawCommands | RenderThread | GPU command submission | Shader compilation |
| GPU Completion Fence | RenderThread | Wait for GPU to finish | Texture uploads, overdraw |
|  | SurfaceFlinger | Buffer swap to display | Missed vsync deadline |

The thing that trips people up: your main thread can finish in 4ms and you still drop the frame if the RenderThread GPU completion fence stalls past the vsync boundary. Perfetto shows this as a gap between the end of DrawFrame on the RenderThread track and the start of the next doFrame on the main thread track.
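As a rough mental model of why a fast main thread does not guarantee a presented frame, the sketch below adds up the stages from the table and compares the total with the vsync period. `FrameTiming` and its fields are hypothetical names, and treating the stages as strictly sequential is a simplification, since the real pipeline overlaps work across frames.

```kotlin
/** Hypothetical per-frame durations in milliseconds, as you might read them off a trace. */
data class FrameTiming(
    val mainThreadMs: Double,   // doFrame on the main thread
    val renderThreadMs: Double, // sync plus issuing draw commands on RenderThread
    val gpuFenceWaitMs: Double, // time spent waiting on the GPU completion fence
) {
    val totalMs: Double get() = mainThreadMs + renderThreadMs + gpuFenceWaitMs
}

/** True when the frame's combined work does not fit in one vsync period. */
fun missesDeadline(frame: FrameTiming, vsyncPeriodMs: Double = 1000.0 / 60.0): Boolean =
    frame.totalMs > vsyncPeriodMs

fun main() {
    val healthy = FrameTiming(mainThreadMs = 4.0, renderThreadMs = 3.0, gpuFenceWaitMs = 2.0)
    val stalled = FrameTiming(mainThreadMs = 4.0, renderThreadMs = 3.0, gpuFenceWaitMs = 12.0)

    // Same 4ms main thread in both cases; only the fence wait differs.
    println("healthy misses deadline: ${missesDeadline(healthy)}")
    println("stalled misses deadline: ${missesDeadline(stalled)}")
}
```

A main-thread-only profile would show both frames as identical, which is why the fence wait has to be read from the trace.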

RenderThread stalls

RenderThread operates on RenderNode display lists. When Compose generates too many draw operations through deep modifier chains, unbounded Canvas draws, or large image layers, the display list balloons and GPU completion takes longer.

The pattern I’ve seen cause the most trouble in production is first-frame shader compilation. The RenderThread blocks on GrGLGpu::compile while the GPU compiles a shader variant it hasn’t seen before. This shows up as a single 40-80ms spike that is completely invisible in CPU profiling. It happens once, it looks like noise, and teams ignore it. Don’t.

Pre-warm shader paths during splash screen transitions, and keep your RenderNode tree flat. Every nested graphicsLayer creates a new RenderNode with its own display list.

Compose recomposition traps

Compose’s recomposition model is efficient right up until you hand it an unstable lambda or an unmarked data class. Then it falls apart fast. Two patterns dominate:

Unstable lambda captures

// BAD: New lambda instance every recomposition
@Composable
fun ItemRow(item: Item, viewModel: MyViewModel) {
    Button(onClick = { viewModel.onItemClick(item.id) }) {
        Text(item.name)
    }
}

// GOOD: Stable reference via remember, keyed on everything the lambda captures
@Composable
fun ItemRow(item: Item, viewModel: MyViewModel) {
    val onClick = remember(item.id, viewModel) { { viewModel.onItemClick(item.id) } }
    Button(onClick = onClick) {
        Text(item.name)
    }
}

A freshly allocated lambda never equals the previous instance, so Compose treats the onClick parameter as changed and cannot skip the Button, which recomposes along with all its children. In a LazyColumn with 50 visible items, that’s up to 50 unnecessary recompositions per frame during scroll. You can watch it happen in real time with the composition count overlay, and it’s painful.
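The underlying behavior is ordinary Kotlin and can be reproduced without Compose. The sketch below is a plain-Kotlin stand-in: `makeClickHandler` and `HandlerCache` are hypothetical names, and the cache only imitates what `remember(key)` does; it is not how Compose implements it. Recent Compose compiler versions with strong skipping enabled memoize many lambdas automatically, so check your compiler version before assuming every inline lambda is a problem.

```kotlin
/** Builds a handler that captures [id] and [sink], like a lambda capturing item.id and a view model. */
fun makeClickHandler(id: String, sink: MutableList<String>): () -> Unit = { sink.add(id) }

/** Hypothetical stand-in for remember(key): returns the cached handler while the key is unchanged. */
class HandlerCache {
    private var lastKey: String? = null
    private var cached: (() -> Unit)? = null

    fun get(key: String, create: () -> (() -> Unit)): () -> Unit {
        val existing = cached
        if (existing != null && lastKey == key) return existing
        val fresh = create()
        lastKey = key
        cached = fresh
        return fresh
    }
}

fun main() {
    val clicks = mutableListOf<String>()

    // Two "recompositions" with the same id typically yield two distinct objects on the JVM.
    val first = makeClickHandler("42", clicks)
    val second = makeClickHandler("42", clicks)
    println("fresh lambdas identical: ${first === second}")

    // With a keyed cache, the same instance comes back until the key changes.
    val cache = HandlerCache()
    val a = cache.get("42") { makeClickHandler("42", clicks) }
    val b = cache.get("42") { makeClickHandler("42", clicks) }
    val c = cache.get("43") { makeClickHandler("43", clicks) }
    println("cached identical: ${a === b}, after key change: ${a === c}")
}
```

The keyed cache is the reason the remember call needs every captured value in its key list: a stale key hands back a handler that still points at the old captures.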

Missing stability annotations

The Compose compiler skips recomposition for parameters it can prove are stable. Classes from modules that are not compiled with the Compose compiler, such as a plain Kotlin or Java library, are treated as unstable by default.

// Mark data classes as stable when you control mutation
@Immutable
data class UiItem(val id: String, val title: String, val imageUrl: String)

Use the Compose Compiler metrics report (-P plugin:androidx.compose.compiler.plugins.kotlin:metricsDestination=...) to audit which classes the compiler treats as unstable. I’ve seen scroll jank drop by 60% after stabilizing just three key model classes. Three classes. That’s it.
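One way to wire the report into a build, assuming the Kotlin 2.x Compose Compiler Gradle plugin (`org.jetbrains.kotlin.plugin.compose`) is applied in the module’s `build.gradle.kts`. The `compose_compiler` output directory name is an arbitrary choice for this sketch; older setups pass the `-P plugin:...` flag through `freeCompilerArgs` instead.

```kotlin
// build.gradle.kts (module level); assumes the Compose Compiler Gradle plugin is applied
composeCompiler {
    // Per-module stability and skippability reports (*-classes.txt, *-composables.txt)
    reportsDestination = layout.buildDirectory.dir("compose_compiler")
    // Aggregate metrics as JSON (*-module.json)
    metricsDestination = layout.buildDirectory.dir("compose_compiler")
}
```

After a release build, the classes report lists each class as stable or unstable, which is the quickest way to find the handful of model classes worth annotating.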

Lazy layout prefetch tuning

LazyColumn prefetches items ahead of the scroll direction. The default prefetch distance works for simple items, but complex compositions blow past the prefetch frame budget.

| Prefetch strategy | Frame budget used | Scroll smoothness |
| --- | --- | --- |
| Default (next item) | ~4-6ms per prefetch | Good for simple items |
| Custom LazyLayoutPrefetchStrategy(3) | ~2ms x 3 items spread across frames | Better for complex items |
| Over-prefetch (10+) | Steals budget from visible frames | Worse, causes visible jank |

The sweet spot is prefetching 2-3 items with compositions that complete within 3ms each. Measure this in Perfetto by filtering to compose:recompose slices during scroll.
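A back-of-the-envelope check for that sweet spot, as a sketch: given how long the visible frame’s own work takes and how long one item takes to compose, estimate how many prefetch compositions fit in the remaining idle time. `prefetchItemsThatFit` is a hypothetical helper, and the single-frame model is an assumption; the real prefetcher schedules work across frames.

```kotlin
/** Estimates how many prefetch compositions fit into the idle time left after a frame's own work. */
fun prefetchItemsThatFit(
    visibleWorkMs: Double,
    perItemPrefetchMs: Double,
    frameBudgetMs: Double = 1000.0 / 60.0, // one vsync period at 60Hz
): Int {
    require(perItemPrefetchMs > 0.0) { "per-item cost must be positive" }
    // Idle time cannot be negative: an over-budget frame leaves no room for prefetch.
    val idleMs = (frameBudgetMs - visibleWorkMs).coerceAtLeast(0.0)
    return (idleMs / perItemPrefetchMs).toInt()
}

fun main() {
    // Measured values would come from slices in a Perfetto trace; these are made up.
    println("3ms items, 8ms frame:  ${prefetchItemsThatFit(visibleWorkMs = 8.0, perItemPrefetchMs = 3.0)}")
    println("5ms items, 8ms frame:  ${prefetchItemsThatFit(visibleWorkMs = 8.0, perItemPrefetchMs = 5.0)}")
    println("3ms items, 15ms frame: ${prefetchItemsThatFit(visibleWorkMs = 15.0, perItemPrefetchMs = 3.0)}")
}
```

The takeaway is that per-item composition cost, not the prefetch count itself, determines how far ahead you can afford to prefetch.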

Display list batching

Every Modifier.drawBehind or Canvas block generates draw operations in the RenderNode display list. If you have related draws, batch them into a single modifier:

// BAD: Three separate draw passes
Modifier
    .drawBehind { drawRect(backgroundColor) }
    .drawBehind { drawRoundRect(borderColor, cornerRadius = CornerRadius(8.dp.toPx())) }
    .drawBehind { drawCircle(accentColor, radius = 4.dp.toPx()) }

// GOOD: Single draw pass
// Note: cornerRadius takes a CornerRadius (androidx.compose.ui.geometry), not a raw Float
Modifier.drawBehind {
    drawRect(backgroundColor)
    drawRoundRect(borderColor, cornerRadius = CornerRadius(8.dp.toPx()))
    drawCircle(accentColor, radius = 4.dp.toPx())
}

This reduces RenderNode operations and keeps display list serialization under budget.

What to actually do

Profile with Perfetto, not just Android Studio. Filter on the RenderThread track and look for GPU completion fence stalls exceeding 8ms. That’s where your invisible jank lives.

Run the Compose Compiler stability report on every release. Annotate key UI model classes with @Immutable or @Stable, wrap lambdas in remember with proper keys, and audit recomposition counts with the layout inspector’s composition count overlay.

Tune LazyColumn prefetch to 2-3 items and flatten your RenderNode tree. Batch draw operations into single modifiers. Minimize nested graphicsLayer calls. And stop looking at average frame time. Your users feel the 99th percentile.

