Moving a chunk-size slider in the Streamlit dashboard didn’t just change that one ingestion. It changed the chunk size for everything in the process, including code paths that should have used the .env defaults. The sidebar was reaching into global state.
// 01 — THE SETUP
The dashboard has sidebar sliders for chunk size and overlap. The ingestion handler applied them before chunking, so a user could tune retrieval interactively.
// 02 — THE SYMPTOM
Setting the slider to 1024 made every subsequent operation in that Python process use 1024, not just the current ingestion. Any module that later read the chunking config got the sidebar’s value, not the configured default. Configuration was bleeding across requests and across modules.
// 03 — THE CULPRIT
The handler applied the slider by writing to the shared settings object:
settings.chunking.chunk_size = chunk_size  # mutates the global singleton
Pydantic BaseSettings instances are mutable by default, and settings is a process-wide singleton. Writing to it didn’t apply a value for this request. It rewrote the global for the rest of the process’s lifetime, so every later reader in the same process inherited the sidebar’s choice until something overwrote it or the process restarted.
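The bleed is easy to reproduce without Streamlit or Pydantic. The sketch below uses plain mutable dataclasses as a stand-in for the settings singleton; the class names, the handler name, and the 512/64 defaults are hypothetical and only there for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkingConfig:
    chunk_size: int = 512      # hypothetical .env default
    chunk_overlap: int = 64    # hypothetical .env default


@dataclass
class Settings:
    chunking: ChunkingConfig = field(default_factory=ChunkingConfig)


# Process-wide singleton, standing in for the real settings object.
settings = Settings()


def ingest_with_slider(chunk_size: int) -> None:
    """The buggy pattern: apply the slider by writing to the global."""
    settings.chunking.chunk_size = chunk_size


def unrelated_reader() -> int:
    """Any other code path that expects the configured default."""
    return settings.chunking.chunk_size


before = unrelated_reader()
ingest_with_slider(1024)       # one "request" moves the slider
after = unrelated_reader()     # a different code path reads the config

print(before, after)  # 512 1024
```

The second read never touched the slider, yet it sees 1024. That is the whole bug in miniature.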
// 04 — THE FIX
Never write to the global. Pass the sidebar values into a locally constructed splitter and leave settings untouched:
splitter = RecursiveCharacterTextSplitter(
    chunk_size=sidebar_chunk_size,
    chunk_overlap=sidebar_overlap,
)
chunks = chunk_page(text, splitter=splitter)  # local instance, no global write
The sidebar now configures this operation only; the shared settings stay the .env source of truth.
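The same idea can be sketched without the real splitter: overrides arrive as function arguments, fall back to the shared defaults when absent, and never get written back. Everything here is an assumption for illustration: `DEFAULTS` stands in for `settings.chunking`, `chunk_text` is a naive fixed-width chunker (not the recursive splitter used above), and the 512/64 defaults are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChunkingConfig:
    chunk_size: int = 512      # hypothetical default
    chunk_overlap: int = 64    # hypothetical default


# Stand-in for settings.chunking; read from, never written to.
DEFAULTS = ChunkingConfig()


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Naive fixed-width chunker used only to make the sketch runnable."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def ingest(
    text: str,
    chunk_size: Optional[int] = None,
    chunk_overlap: Optional[int] = None,
) -> List[str]:
    """Resolve per-call values locally; the shared defaults stay untouched."""
    size = DEFAULTS.chunk_size if chunk_size is None else chunk_size
    overlap = DEFAULTS.chunk_overlap if chunk_overlap is None else chunk_overlap
    return chunk_text(text, size, overlap)


text = "x" * 2000
tuned = ingest(text, chunk_size=1024, chunk_overlap=0)  # sidebar-style override
default = ingest(text)                                  # later call, no override

print(len(tuned[0]), len(default[0]), DEFAULTS.chunk_size)  # 1024 512 512
```

The later call still chunks at 512 because the override lived only in the arguments of the first call.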
TAKEAWAYS
- Mutable global singletons + per-request config = cross-request bleed. One user’s slider becomes everyone’s default.
- Apply request-scoped configuration to local instances, never by writing back into a shared settings object.
- Pydantic BaseSettings is mutable by default. Treat it as read-only after load, or you’ll mutate it by accident.
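One way to enforce the read-only rule rather than rely on discipline is to make the settings object refuse writes. The sketch below shows the idea with frozen stdlib dataclasses as a stand-in; the class names and defaults are hypothetical. Pydantic v2 exposes a `frozen` option in model config that should give a similar guard, though nested models likely need it set as well, so check the behavior against the version in use.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class ChunkingConfig:
    chunk_size: int = 512      # hypothetical default
    chunk_overlap: int = 64    # hypothetical default


@dataclass(frozen=True)
class Settings:
    # Both levels are frozen; freezing only the outer object would still
    # allow settings.chunking.chunk_size to be reassigned.
    chunking: ChunkingConfig = field(default_factory=ChunkingConfig)


settings = Settings()

# The accidental write now fails loudly instead of bleeding across requests.
try:
    settings.chunking.chunk_size = 1024
    write_blocked = False
except FrozenInstanceError:
    write_blocked = True

# Request-scoped overrides become a local copy, not a mutation.
local = replace(settings.chunking, chunk_size=1024)

print(write_blocked, local.chunk_size, settings.chunking.chunk_size)
# True 1024 512
```

With the guard in place, the original bug would have surfaced as an exception on the first slider change instead of as silent drift.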
NEXT
- Anomaly log: why my .env was silently ignored.
