Weighing a truck by loading it onto a scale: a 10× React Native performance regression

A 'feat: Improve photo analysis' commit silently turned a 0.001s synchronous metadata read into a blocking async image load with a timeout of up to 2 seconds per asset. A photo library that scanned in 90 seconds now took up to 17 minutes. The fix, the root cause, and the lesson on reading manifests before lifting cargo.

A few days ago we shipped a commit titled feat: Improve photo analysis. The diff looked harmless. By the next afternoon, our smart-analysis pass on a 26,000-photo library — which had been running in ~90 seconds — was taking 6 to 17 minutes.

Same code path. Same device. Same library. Roughly 4 to 11× slower.

This is the post-mortem.

What we saw

The smart-analysis pipeline streams photos through the native module, computes a small set of attributes per asset (size, locality, basic EXIF), and writes them into a SQLite index. We have telemetry on throughput per batch, and it adapts: if recent batches finish quickly, the batch size grows.

After the regression, the telemetry told us a depressing little story:

[PhotoLibraryToolkit] ℹ️ Performance good (66.2 items/s), increasing batch to 49
[PhotoLibraryToolkit] ℹ️ Performance good (67.4 items/s), increasing batch to 73
[PhotoLibraryToolkit] ℹ️ Performance good (69.7 items/s), increasing batch to 109
[PhotoLibraryToolkit] ℹ️ Performance good (68.2 items/s), increasing batch to 163

It thought everything was fine: these numbers clear the batcher’s “increase batch” check. But the historical normal was 150–600 items/s. We were now running at a fraction of that, roughly 2 to 9× lower, and the adaptive scheduler had no way to know.

What the commit changed

The offender was a single Swift function: fileInfo(for asset: PHAsset). It returns two values: the asset’s file size, and whether the asset is stored locally (vs. in iCloud).

Before:

private func fileInfo(for asset: PHAsset) -> (Int64, Bool) {
    var size: Int64 = 0
    var isLocal = false

    let resources = PHAssetResource.assetResources(for: asset)
    if let resource = resources.first {
        if let fileSize = resource.value(forKey: "fileSize") as? Int64 {
            size = fileSize
            isLocal = true
        }
    }
    return (size, isLocal)
}

Synchronous. Reads the size from PhotoKit’s resource metadata — a property lookup, basically zero cost. ~0.001s per asset.

After:

private func fileInfo(for asset: PHAsset) -> (Int64, Bool) {
    var size: Int64 = 0
    var isLocal = false
    let semaphore = DispatchSemaphore(value: 0)

    if asset.mediaType == .image {
        let options = PHContentEditingInputRequestOptions()
        options.isNetworkAccessAllowed = false

        asset.requestContentEditingInput(with: options) { input, info in
            if let url = input?.fullSizeImageURL {
                isLocal = true
                do {
                    let attr = try FileManager.default.attributesOfItem(atPath: url.path)
                    if let fileSize = attr[.size] as? Int64 {
                        size = fileSize
                    }
                } catch { /* ... */ }
            }
            semaphore.signal()
        }
    } else if asset.mediaType == .video {
        // Symmetric pattern with requestAVAsset
    }

    _ = semaphore.wait(timeout: .now() + 2.0)  // up to 2 seconds. per asset.
    return (size, isLocal)
}

Same return value. Different means of getting it.

The “improvement” was that the new code is more accurate. requestContentEditingInput resolves the actual file on disk and reports its real size, instead of trusting PhotoKit’s cached metadata (which can occasionally be stale for HEIC files with auxiliary data).

The hidden price tag:

  1. requestContentEditingInput is async. Every asset now pays the cost of an async dispatch + semaphore wait. The 2-second timeout was a worst-case ceiling; the typical case was still expensive.

  2. It loads the entire image into memory to give you that full-size URL. A 10 MB photo loads 10 MB of pixels into memory just so we can ask attributesOfItem for its byte count.

  3. The outer pipeline uses concurrency: 10. Doesn’t help. All ten workers block on their own semaphore wait, then compete for memory across ten in-flight image decodes.

It’s the trade I named this post after: instead of reading the truck’s manifest, the new code drove the truck onto a scale.
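A back-of-the-envelope check, using only figures quoted in this post, shows how those costs add up. The constant names are hypothetical, and the model assumes the ten workers run perfectly in parallel and that fileInfo dominates the per-asset time, both of which are simplifications.

```swift
// Illustrative arithmetic only; every constant is a figure quoted in this post.
let assetCount = 26_000.0
let workers = 10.0          // the pipeline's `concurrency: 10`
let timeoutSeconds = 2.0    // semaphore timeout per asset

// Worst case: every request runs into the 2-second timeout.
let worstCaseSeconds = assetCount * timeoutSeconds / workers   // 5,200 s, about 87 minutes

// Observed: about 67 items/s across ten workers implies each call blocked
// its worker for roughly 10 / 67 ≈ 0.15 s, versus ~0.001 s before.
let observedThroughput = 67.0
let perCallSeconds = workers / observedThroughput              // ≈ 0.149 s
let observedTotalSeconds = assetCount / observedThroughput     // ≈ 388 s, about 6.5 minutes

print(worstCaseSeconds, perCallSeconds, observedTotalSeconds)
```

Under these assumptions, the typical case sits far below the 2-second ceiling and still costs about 150 times the old per-asset lookup.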

How we found it

The first telemetry sample after the regression looked confusing because the adaptive batcher kept reporting “performance good”. It treated 67 items/s as healthy throughput because it had no absolute baseline encoded: it was responding to relative deltas between batches, not to an absolute threshold.
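To make that failure mode concrete, here is a minimal sketch of a batcher that only compares each batch against the previous one. The type name, the starting batch size of 33, the 10% tolerance, and the shrink rule are all assumptions for illustration. The 1.5× growth factor is inferred from the batch sizes in the log above (49, 73, 109, 163), not taken from the toolkit's source.

```swift
// Hypothetical relative-only adaptive batcher (illustrative, not the toolkit's code).
struct RelativeBatcher {
    var batchSize: Int
    var lastThroughput: Double? = nil

    // Assumed rule: grow by 1.5× (rounded down) while throughput holds within
    // 10% of the previous batch; otherwise shrink by the same factor.
    mutating func record(itemsPerSecond: Double) -> Int {
        let holding = lastThroughput.map { itemsPerSecond >= $0 * 0.9 } ?? true
        if holding {
            batchSize = batchSize * 3 / 2
        } else {
            batchSize = max(1, batchSize * 2 / 3)
        }
        lastThroughput = itemsPerSecond
        return batchSize
    }
}

var batcher = RelativeBatcher(batchSize: 33)   // assumed starting size
let observed = [66.2, 67.4, 69.7, 68.2]        // items/s from the log above
let sizes = observed.map { batcher.record(itemsPerSecond: $0) }
print(sizes)  // [49, 73, 109, 163]
```

Nothing in this rule ever asks whether 67 items/s is a reasonable number for the device, so a slow but steady pipeline keeps getting bigger batches.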

Once we noticed the absolute number was wrong, the investigation took about 15 minutes:

  1. git log --since="2 days ago" -- packages/expo-photo-library-toolkit/ios/ surfaced four commits.
  2. Three were obviously cosmetic. The fourth was titled feat: Improve photo analysis.
  3. git show 7abade2 -- ios/SmartAnalysisService.swift showed the new fileInfo body.
  4. The phrase DispatchSemaphore(value: 0) in a per-asset function is enough to flinch at, before reading anything else.

The fix

We reverted to the synchronous metadata path, with one improvement carried forward: a fallback estimate for iCloud-only assets where PhotoKit reports a size of zero. The estimate is rough (width × height × 1.5 bytes for typical JPEG/HEIC compression ratios) but it lets the cleanup UI show a meaningful “freeing X GB” total even when assets haven’t been downloaded.

private func fileInfo(for asset: PHAsset) -> (Int64, Bool) {
    var size: Int64 = 0
    var isLocal = false

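    // Metadata-only lookup: no image data is loaded. Note that "fileSize" is an
    // undocumented KVC key on PHAssetResource, not part of the public API.
    let resources = PHAssetResource.assetResources(for: asset)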
    if let resource = resources.first {
        if let fileSize = resource.value(forKey: "fileSize") as? Int64 {
            size = fileSize
        } else if let fileSize = resource.value(forKey: "fileSize") as? NSNumber {
            size = fileSize.int64Value
        }
        // iCloud-only files typically have no fileSize or 0.
        isLocal = (size > 0)
    }

    // Fallback estimate for iCloud assets.
    if size == 0 {
        let w = Int64(asset.pixelWidth)
        let h = Int64(asset.pixelHeight)
        size = (w * h * 3) / 2
        isLocal = false
    }

    return (size, isLocal)
}
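To get a feel for the fallback, here is the same arithmetic pulled out into a standalone function and applied to a 12-megapixel frame. The function name is hypothetical, and the 4032 × 3024 dimensions are an assumed example rather than a measurement from our library.

```swift
// Hypothetical standalone version of the fallback estimate: width × height × 1.5 bytes.
func estimatedByteSize(pixelWidth: Int, pixelHeight: Int) -> Int64 {
    let w = Int64(pixelWidth)
    let h = Int64(pixelHeight)
    return (w * h * 3) / 2   // integer form of × 1.5
}

let estimate = estimatedByteSize(pixelWidth: 4032, pixelHeight: 3024)
print(estimate)  // 18289152 bytes, roughly 18.3 MB

// Assets that report no pixel dimensions still estimate to zero.
print(estimatedByteSize(pixelWidth: 0, pixelHeight: 0))  // 0
```

Treat the result as a rough figure; actual compressed sizes vary widely with image content.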

The 26,000-photo library is back to ~90 seconds. The throughput telemetry is back above 150 items/s.

What we changed in how we work

A revert is easy. The interesting work is making sure a regression like this can’t ship silently again.

We added three things:

  1. Absolute throughput floors in telemetry, not just relative deltas. Anything under 100 items/s on iPhone 12+ class hardware now logs a warning, even when the adaptive batcher is happy.

  2. A “Critical: avoid this API” doc inside the native module repo. PHContentEditingInputRequestOptions joins a short list of PhotoKit APIs that are tempting (they look like the “right” abstraction) but are catastrophically expensive for bulk operations. Future commits that reference these symbols trigger a lint warning that links to the doc.

  3. A benchmark suite that runs against a synthetic 5,000-asset library as part of CI for the native module. It doesn’t catch every regression, but it catches the order-of-magnitude ones, which is exactly the class that slipped through here under a “feat” commit.
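The first item can be sketched as a check that runs alongside the adaptive batcher, independent of its relative logic. The type and method names are hypothetical, and the 100 items/s floor is simply the figure quoted above; a real implementation would presumably choose the floor per device class.

```swift
// Hypothetical absolute-floor check; independent of any relative batch logic.
struct ThroughputFloor {
    let minimumItemsPerSecond: Double

    // Returns a warning message when throughput is below the floor, else nil.
    func warning(items: Int, elapsedSeconds: Double) -> String? {
        guard elapsedSeconds > 0 else { return nil }
        let throughput = Double(items) / elapsedSeconds
        guard throughput < minimumItemsPerSecond else { return nil }
        return "Throughput \(throughput) items/s is below the floor of \(minimumItemsPerSecond) items/s"
    }
}

let throughputFloor = ThroughputFloor(minimumItemsPerSecond: 100)

// A batch of 163 items at the regressed rate (about 68 items/s) now warns...
if let message = throughputFloor.warning(items: 163, elapsedSeconds: 2.4) {
    print("⚠️ \(message)")
}
// ...while the same batch at a healthy rate (about 326 items/s) stays quiet.
print(throughputFloor.warning(items: 163, elapsedSeconds: 0.5) == nil)  // true
```

Because the check never looks at the previous batch, it fires even when the batcher itself is still reporting “performance good”.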

The lesson, in one line

Before you call an async API to get a number you could read from metadata: ask whether you’re weighing the truck, or just reading the manifest.
