
The Partition That Existed But Couldn't Be Found

A DDL-visibility bug, and why partition creation has to commit alone.


On the first event of every new month, the pipeline would die with:

no partition of relation "fact_events" found for row

This happened immediately after creating that exact partition. The code that built the partition and the code that couldn’t find it were in the same run, seconds apart. This is the entry on why, and on the PostgreSQL rule that makes it inevitable if you’re not careful.

// 01 — THE SETUP

fact_events is partitioned by month, and partitions are created on demand: the pipeline scans pending events for any month it hasn’t seen and runs warehouse.create_monthly_partition(). Originally this happened inside the same transaction as the batch insert: create the partition, then insert the events, all in one unit of work. Tidy, and wrong.
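The body of warehouse.create_monthly_partition() isn’t shown here. As a rough sketch of what a function like it has to work out, the Python below computes a month’s half-open date range and builds the matching DDL. The helper names, the fact_events_YYYY_MM naming scheme, and the assumption that fact_events is range-partitioned on a date column are all illustrative, not taken from the real function.

```python
from datetime import date


def month_bounds(day: date) -> tuple[date, date]:
    """Return (first day of day's month, first day of the following month)."""
    start = day.replace(day=1)
    if start.month == 12:
        # December rolls over into January of the next year.
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return start, end


def partition_ddl(day: date, parent: str = "fact_events") -> str:
    """Build the CREATE TABLE statement for the month containing `day`.

    The partition name format is a hypothetical convention.
    """
    start, end = month_bounds(day)
    name = f"{parent}_{start:%Y_%m}"
    # Range partition bounds are inclusive at FROM and exclusive at TO.
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}')"
    )
```

Because the upper bound is exclusive, consecutive months share a boundary date without overlapping, so an event stamped on the first of a month lands in exactly one partition.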

// 02 — THE SYMPTOM

The first event of a new month failed every time. The partition clearly got created. The code ran, no error. Yet the very next INSERT in the same run insisted no partition existed for the row. A partition that was both there and not there.

// 03 — THE CULPRIT

CREATE TABLE is DDL, and in PostgreSQL DDL is transactional: its effects become visible to other connections only after the creating transaction commits. The pipeline processes each event on its own pooled connection, inside its own transaction. The partition was created in transaction A, which was still open. When process_event() ran on connection B and PostgreSQL tried to route the row to a partition, B couldn’t see A’s uncommitted partition. To B, it didn’t exist yet.

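It wasn’t a race in the usual sense. It was visibility: one connection cannot see another’s uncommitted schema changes, by design. Had the INSERT run on connection A, inside the same open transaction, it would have found the partition.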
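One way to watch the visibility rule directly is to ask each connection whether it can see the new table in the pg_class catalog. The sketch below assumes an asyncpg-style pool (acquire(), transaction(), execute(), fetchval()). The function names and the partition_name argument are hypothetical, and the lookup ignores schemas for brevity.

```python
async def can_see_table(conn, table_name: str) -> bool:
    """True if this connection can currently see a table with this name."""
    found = await conn.fetchval(
        "SELECT 1 FROM pg_class WHERE relname = $1", table_name
    )
    return found is not None


async def probe_visibility(pool, month, partition_name: str) -> dict:
    """Create a partition on connection A and check who can see it, and when."""
    async with pool.acquire() as conn_a, pool.acquire() as conn_b:
        async with conn_a.transaction():
            await conn_a.execute(
                "SELECT warehouse.create_monthly_partition($1)", month
            )
            # A reads its own uncommitted catalog changes.
            a_inside_tx = await can_see_table(conn_a, partition_name)
            # B is a separate session and A has not committed yet.
            b_before_commit = await can_see_table(conn_b, partition_name)
        # Leaving the transaction block without an error commits A.
        b_after_commit = await can_see_table(conn_b, partition_name)
    return {
        "a_inside_tx": a_inside_tx,
        "b_before_commit": b_before_commit,
        "b_after_commit": b_after_commit,
    }
```

Against a real database, the expectation is that A sees the table immediately while B sees it only after A commits, which is the same gap the failing INSERT fell into.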

// 04 — THE FIX

Partition creation became Phase 1, its own pool.acquire() block that creates every needed partition and commits before the batch transaction (Phase 2) begins:

# Phase 1: committed on its own, before any event is processed
async with pool.acquire() as conn:
    # explicit transaction (asyncpg-style API assumed): all partitions commit together
    async with conn.transaction():
        for month in pending_months:
            await conn.execute("SELECT warehouse.create_monthly_partition($1)", month)
    # transaction has committed here

# Phase 2: now every connection can see the partitions

By the time events are processed on their own connections, the partitions are committed and visible to all of them. The “no partition found” error never returned.
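Put together, the two phases might look like the sketch below. It is an outline under assumptions, not the pipeline’s actual code: the pool is assumed to be asyncpg-style, events are assumed to be dicts with a date in a hypothetical occurred_at field, and process_event is passed in with an assumed (conn, event) signature.

```python
from datetime import date


def months_needed(events) -> list[date]:
    """Distinct month starts covered by the events, in sorted order."""
    return sorted({e["occurred_at"].replace(day=1) for e in events})


async def run_batch(pool, events, process_event) -> None:
    # Phase 1: create every partition the batch needs, then commit.
    async with pool.acquire() as conn:
        async with conn.transaction():
            for month in months_needed(events):
                await conn.execute(
                    "SELECT warehouse.create_monthly_partition($1)", month
                )
    # The Phase 1 transaction has committed by this point.

    # Phase 2: each event gets its own connection and its own transaction.
    for event in events:
        async with pool.acquire() as conn:
            async with conn.transaction():
                await process_event(conn, event)
```

The ordering is the whole point: no event transaction starts until the Phase 1 transaction has committed, so it doesn’t matter which pooled connection an event ends up on.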

ANTHONY PENA · @FROGWEBP
I build data systems and write about everything around them: the architecture, the failures, and what each one teaches me. Documenting in public since 2021: the process, not just the result.
