
"Unknown" Was the Most Dangerous Row in My Warehouse

Why a synthetic fallback bucket quietly corrupts tenant metrics.

Anonymous events, ones with no account_name, were being filed under a real account named “Unknown.” It looked harmless. It was the most dangerous row in the warehouse.

// 01 — THE SETUP

Events arrive with an optional account_name. The pipeline upserts an account dimension and links each event to it. The original code had a reasonable-looking fallback for missing names:

account_name = payload.get("account_name") or "Unknown"

// 02 — THE SYMPTOM

Every anonymous event was upserted into dim_accounts as account_name = 'Unknown' with a real account_id. All anonymous traffic, across every tenant and every time period, merged into a single fake account that showed up in every dashboard as if it were a customer. Worse, any real company actually named “Unknown” would silently fuse with that anonymous traffic and have its metrics destroyed.
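
To make the merge concrete, here is a minimal sketch that assumes a simplified in-memory stand-in for dim_accounts keyed by name alone. The helper names (upsert_account, link_event_buggy), the tenant field, and the sample payloads are hypothetical and only illustrate the behaviour described above; the real pipeline is not shown in this post.

```python
from itertools import count

# Hypothetical in-memory stand-in for dim_accounts: account_name -> account_id.
dim_accounts = {}
_next_id = count(1)


def upsert_account(account_name):
    """Return the account_id for a name, creating the row if it is new."""
    if account_name not in dim_accounts:
        dim_accounts[account_name] = next(_next_id)
    return dim_accounts[account_name]


def link_event_buggy(payload):
    """Original behaviour: a missing name falls back to the string "Unknown"."""
    account_name = payload.get("account_name") or "Unknown"
    return upsert_account(account_name)


# Two anonymous events from different tenants, a real company that happens
# to be named "Unknown", and an ordinary named account for comparison.
anon_a = link_event_buggy({"tenant": "t1"})
anon_b = link_event_buggy({"tenant": "t2", "account_name": None})
real_unknown = link_event_buggy({"tenant": "t3", "account_name": "Unknown"})
acme = link_event_buggy({"tenant": "t1", "account_name": "Acme"})

# The first three resolve to the same account_id, so their metrics fuse.
print(anon_a == anon_b == real_unknown)  # True
print(acme != anon_a)                    # True
```

Once the events share an account_id, nothing downstream can tell the anonymous traffic apart from the real company.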

// 03 — THE CULPRIT

The or "Unknown" fallback mapped a missing value to a present one. In a dimensional model, that’s the cardinal sin: it invents a member that doesn’t exist and attributes real events to it. A null (“we don’t know the account”) was being laundered into a fact (“the account is Unknown”).

// 04 — THE FIX

Let absence stay absent:

account_name = payload.get("account_name") or None

When account_name is None, account_id is left NULL and no dim_accounts row is created. Anonymous events are simply excluded from account-level aggregations rather than polluting a fake bucket. The dashboards now answer “conversion by account” using only events that actually have an account. That is the only honest answer.
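
The fixed flow can be sketched end to end with an in-memory SQLite database. This is an illustration under stated assumptions, not the actual warehouse code: the fact_events table, the converted column, the ingest helper, and the sample payloads are all hypothetical, and only dim_accounts, account_id, and account_name come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_accounts (
        account_id   INTEGER PRIMARY KEY,
        account_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE fact_events (
        event_id   INTEGER PRIMARY KEY,
        account_id INTEGER REFERENCES dim_accounts(account_id),  -- nullable on purpose
        converted  INTEGER NOT NULL
    );
""")


def ingest(payload):
    """Link an event to an account only when the payload names one."""
    account_name = payload.get("account_name") or None
    account_id = None
    if account_name is not None:
        # Upsert the dimension row, then look up its id.
        conn.execute(
            "INSERT OR IGNORE INTO dim_accounts (account_name) VALUES (?)",
            (account_name,),
        )
        account_id = conn.execute(
            "SELECT account_id FROM dim_accounts WHERE account_name = ?",
            (account_name,),
        ).fetchone()[0]
    conn.execute(
        "INSERT INTO fact_events (account_id, converted) VALUES (?, ?)",
        (account_id, int(payload.get("converted", False))),
    )


for p in [
    {"account_name": "Acme", "converted": True},
    {"account_name": "Acme", "converted": False},
    {"converted": True},                             # anonymous: no name at all
    {"account_name": "", "converted": False},        # empty string is treated as missing
    {"account_name": "Unknown", "converted": True},  # a real company keeps its own row
]:
    ingest(p)

# The inner join drops events whose account_id is NULL.
rows = conn.execute("""
    SELECT a.account_name, COUNT(*) AS events, SUM(e.converted) AS conversions
    FROM fact_events AS e
    JOIN dim_accounts AS a ON a.account_id = e.account_id
    GROUP BY a.account_name
    ORDER BY a.account_name
""").fetchall()

# Anonymous events are still stored; they are just not attributed to anyone.
anonymous_events = conn.execute(
    "SELECT COUNT(*) FROM fact_events WHERE account_id IS NULL"
).fetchone()[0]
```

In this sketch the anonymous events remain in the fact table, so they can still be counted on their own with an IS NULL filter, and a company genuinely named “Unknown” gets a row that contains only its own events.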

ANTHONY PENA · @FROGWEBP
I build data systems and write about everything around them: the architecture, the failures, and what each one teaches me. Documenting in public since 2021: the process, not just the result.
