Server-driven paywall A/B testing that moves revenue
TL;DR
Most teams A/B test paywall conversion rate, ship the “winner,” and watch revenue stay flat. Conversion rate without discount depth is a vanity metric. This post covers the full architecture: RevenueCat custom placements for server-driven paywalls, feature flag integration for cohort targeting, platform-specific exit offer triggers on Android and iOS, and the statistical framework that tests revenue-per-user instead of conversion rate. Includes sample size math, sequential testing to avoid peeking problems, and cohort isolation for small-audience apps.
The architecture: server-driven paywalls
You want to change what users see on the paywall (offer tiers, copy, discount depth, exit intent triggers) without shipping an app update. The pipeline:
```
[RevenueCat Offerings + Custom Placements]
              ↓
[Feature Flag Service (LaunchDarkly / Statsig)]
              ↓  cohort assignment + payload
[Client SDK fetches placement config]
              ↓
[Render paywall variant → track events → measure LTV]
```
RevenueCat’s custom placements let you define named paywall surfaces (main_paywall, exit_offer, upgrade_nudge) and map each to a specific offering remotely. Combine this with a feature flag service that assigns users to experiment cohorts, and you control the entire presentation layer from your dashboard.
The client code stays thin. On Android with Kotlin:
```kotlin
// Suspending API from the purchases-android coroutine extensions
val offerings = Purchases.sharedInstance.awaitOfferings()
val packages = offerings.getCurrentOfferingForPlacement("exit_offer")
    ?.availablePackages ?: return
// Render the server-defined paywall variant from `packages`
```
No hardcoded product IDs. No app update to test a new discount tier.
Exit offers: platform-specific triggers
Exit offers fire when a user signals intent to leave the paywall. Detection differs quite a bit across platforms.
| Signal | Android | iOS |
|---|---|---|
| Back navigation | OnBackPressedCallback via BackHandler | UIAdaptivePresentationControllerDelegate.presentationControllerDidAttemptToDismiss |
| Swipe dismiss | N/A (back gesture covers this) | UISheetPresentationController delegate callbacks |
| Lifecycle-aware timeout | Lifecycle.Event.ON_PAUSE after threshold | viewWillDisappear with timer validation |
| Trigger control | Server flag: exit_offer_enabled | Same flag, shared config |
One thing that will bite you: on iOS with StoreKit 2, subscription offer eligibility (isEligibleForIntroOffer) is async and user-specific. On Android with Google Play Billing Library 7, offer eligibility lives in ProductDetails.SubscriptionOfferDetails. You must pre-fetch eligibility before showing the exit offer. A 300ms delay on an exit intent screen kills the interaction.
The statistical framework that actually works
Most teams test conversion rate as the primary metric. This is the wrong metric.
Consider two variants:
| Variant | Conversion Rate | Avg Discount | Revenue Per User |
|---|---|---|---|
| A (no discount) | 3.2% | 0% | $1.92 |
| B (50% off annual) | 5.8% | 50% | $1.45 |
Variant B “wins” on conversion. Variant A generates 32% more revenue per user exposed. I’ve seen teams ship Variant B and then spend quarters trying to figure out why MRR didn’t move.
Your primary metric should be revenue-per-user (RPU): total revenue generated divided by total users exposed to the paywall, including non-converters.
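The gap is easy to reproduce from the table above. The per-converter revenue figures here are backed out from the table (RPU divided by conversion rate), so treat the $60 and $25 as illustrative:

```python
def rpu(conversion_rate: float, avg_revenue_per_converter: float) -> float:
    # Non-converters contribute $0 to the numerator but still count
    # in the denominator, which is what makes RPU an honest metric.
    return conversion_rate * avg_revenue_per_converter

variant_a = rpu(0.032, 60.00)  # no discount
variant_b = rpu(0.058, 25.00)  # 50% off, higher conversion
assert variant_a > variant_b   # A wins on revenue despite losing on conversion
```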
Sample size and sequential testing
RPU has high variance (a coefficient of variation of roughly 3-5 for typical subscription apps), so you need much larger samples than conversion rate tests. A rough formula for the per-variant sample size:
n = (2 * (Z_α/2 + Z_β)² * σ²) / δ²
To detect a 10% RPU lift at 80% power and 95% confidence with high-variance revenue data, expect to need at least 5,000-10,000 users per variant, and substantially more as variance grows, since n scales with the square of the coefficient of variation.
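A minimal calculator that plugs σ = CV·μ and δ = lift·μ into the formula above; the μ terms cancel, so only the coefficient of variation and the relative lift matter (the function name is mine):

```python
from math import ceil
from statistics import NormalDist

def rpu_sample_size(cv: float, rel_lift: float,
                    alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-variant n for detecting a relative RPU lift.

    n = 2 * (z_{alpha/2} + z_beta)^2 * sigma^2 / delta^2,
    with sigma = cv * mu and delta = rel_lift * mu, so mu cancels.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    return ceil(2 * (z_alpha + z_beta) ** 2 * (cv / rel_lift) ** 2)

# n scales with (cv / rel_lift)^2: doubling the variance quadruples the sample.
print(rpu_sample_size(cv=3.0, rel_lift=0.10))
```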
The peeking problem: checking results daily and stopping when you see significance inflates your false positive rate from 5% to over 25%. Use sequential testing, either a Bayesian approach with credible intervals or group sequential methods with O’Brien-Fleming spending functions. Statsig handles this natively. With LaunchDarkly, you’ll need to implement the stopping rules yourself or export to a proper experimentation platform.
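Group sequential methods budget the total 5% alpha across interim looks instead of spending it fresh at every peek. A sketch of the Lan-DeMets O'Brien-Fleming-type spending function (the step of converting spent alpha into per-look z thresholds is omitted, and the function name is mine):

```python
from math import sqrt
from statistics import NormalDist

def obf_alpha_spent(t: float, alpha: float = 0.05) -> float:
    """Cumulative alpha spent at information fraction t (0 < t <= 1),
    via the Lan-DeMets O'Brien-Fleming-type spending function:
    alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t)))."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(t)))

# Early looks get almost no alpha, so an early stop needs overwhelming evidence.
for t in (0.25, 0.50, 0.75, 1.00):
    print(f"look at {t:.0%} of data: cumulative alpha = {obf_alpha_spent(t):.4f}")
```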
Cohort isolation in small-audience apps
For apps with smaller user bases (I run into this with niche productivity tools like HealthyDesk, which is a break reminder and desk exercise app for developers), experiment contamination is a real risk. A user who sees the exit offer in one session and the control in another pollutes both cohorts.
The fix: assign cohorts at the user level, persist the assignment in RevenueCat subscriber attributes, and use that as the source of truth across sessions.
```kotlin
Purchases.sharedInstance.setAttributes(
    mapOf("experiment_cohort" to flagService.getCohort(userId))
)
```
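Under the hood, flag services make user-level assignment deterministic by hashing. A sketch of the idea; the sha256-based scheme here is illustrative, not LaunchDarkly's or Statsig's actual bucketing algorithm:

```python
import hashlib

def assign_cohort(user_id: str, experiment: str, variants: list[str]) -> str:
    # Hash user + experiment so the same user always lands in the same
    # variant, and different experiments bucket independently.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest[:8], 16) % len(variants)]

cohort = assign_cohort("user-42", "exit_offer_v1", ["control", "treatment"])
# Persist `cohort` as the experiment_cohort subscriber attribute shown above.
```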
Event taxonomy: connecting impressions to LTV
Your event pipeline needs these minimum events to close the loop:
| Event | Properties | Purpose |
|---|---|---|
| paywall_impression | placement_id, variant, cohort | Denominator for RPU |
| exit_offer_triggered | trigger_type, variant | Exit funnel tracking |
| purchase_initiated | product_id, offer_type, discount_pct | Conversion + discount depth |
| purchase_completed | revenue, currency, is_trial | Revenue attribution |
| subscription_renewed | period, revenue | LTV calculation |
Without discount_pct on the purchase event, you can’t decompose whether a revenue change came from volume or price. Non-negotiable.
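With discount_pct captured, the decomposition is a few lines. The purchase dicts here are stand-ins for whatever shape your event pipeline emits:

```python
def decompose(purchases: list[dict], exposed_users: int) -> dict:
    """Split RPU into a volume term (conversion) and a price term
    (average realized revenue per converter)."""
    conversion = len(purchases) / exposed_users
    avg_revenue = sum(p["revenue"] for p in purchases) / len(purchases)
    avg_discount = sum(p["discount_pct"] for p in purchases) / len(purchases)
    return {
        "rpu": conversion * avg_revenue,
        "conversion": conversion,
        "avg_revenue_per_converter": avg_revenue,
        "avg_discount_pct": avg_discount,
    }

stats = decompose(
    [{"revenue": 30.0, "discount_pct": 50}, {"revenue": 60.0, "discount_pct": 0}],
    exposed_users=100,
)
```

Comparing conversion and avg_revenue_per_converter across variants tells you whether an RPU change came from volume or from price.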
What to do with all this
Test RPU, not conversion rate. When discount depth varies across variants, conversion rate decouples from revenue. Wire revenue-per-exposed-user as your primary metric from day one.
Pre-fetch offer eligibility before exit triggers fire. StoreKit 2 and Play Billing Library 7 handle subscription offer eligibility differently. Cache it when the paywall loads, not when the exit offer appears.
Isolate cohorts at the user level, not the session level. Persist experiment assignments in RevenueCat subscriber attributes and enforce them across sessions. For small-audience apps, contamination will destroy your statistical power faster than insufficient sample size will.