Three systems, shipped and operational. Each codex is the complete build record: architecture decisions, anomalies, fixes, and the reasoning behind every one. This is the portfolio, written as documentation.
// SYSTEM 01 — SPECTRUM
PROJECT CODEX · ANALYTICS WAREHOUSE
A self-hosted Amplitude / Mixpanel built entirely on PostgreSQL 16 and async Python, with no Kafka, no Spark, no ClickHouse. Clickstream events land in a JSONB staging zone in under 5ms, an idempotent ELT pipeline claims them with FOR UPDATE SKIP LOCKED, upserts five dimensions, inserts into a monthly-partitioned fact table, and refreshes four analytics materialized views: funnels, retention cohorts, revenue, A/B experiment results with Wilson confidence intervals. Grafana reads the analytics schema directly.
// BUILD LOG — CHRONOLOGICAL
// SYSTEM 02 — PHRONIS
PROJECT CODEX · AI AGENT OBSERVABILITY
Real-time observability and circuit breaking for AI agents. When an agent misbehaves through tool-call storms, runaway spend, or output drift, streaming SQL detects it in under 500ms and trips a circuit breaker that halts the agent in ~600ms total. Existing tools batch their aggregations every 30–60 seconds; by then a runaway agent has made thousands of calls. Redpanda ingests at ~10ms p99, RisingWave TUMBLE windows detect, Iceberg on MinIO keeps cold history, and an MCP server lets Claude query it all in plain English.
// BUILD LOG — CHRONOLOGICAL
// SYSTEM 03 — CONTEXTFLOW
PROJECT CODEX · SEMANTIC SEARCH ETL
An ETL pipeline that turns multilingual PDFs into a local semantic index. Built for the hard case: LaTeX-compiled academic papers, IPA symbols, broken unicode. Pages stream through extraction and cleaning, chunk at 512 chars with overlap, embed into 384-dim vectors locally (zero API cost), and upsert into ChromaDB with deterministic SHA-256 IDs; re-running never duplicates. Query in plain English, get ranked answers with page-level provenance in under 2 seconds. Airflow orchestrates; Streamlit serves the RAG dashboard.
// BUILD LOG — CHRONOLOGICAL