ADR 0011: Stateless API runtime and graceful lifecycle for horizontal scaling
- Status: Accepted
- Date: 2026-04-15
Context
Publaryn is intended to scale horizontally across multiple API replicas and independently scaled worker pools. The current architecture already externalizes durable state into PostgreSQL, S3-compatible object storage, Redis, and a dedicated search service.
However, horizontal scaling also depends on explicit runtime lifecycle behavior:
- replicas must be safe to add and remove at any time
- orchestrators must have a reliable way to distinguish liveness from readiness
- rolling updates must not cut off in-flight publish or management requests abruptly
- no critical workflow may depend on local container filesystem state
Without these guarantees, Kubernetes or other container platforms can route traffic to instances that are not ready, or terminate replicas mid-request during scale-down and deployment rollouts.
Decision
API runtime model
Publaryn API instances are stateless application replicas. Durable state remains external to the process:
- PostgreSQL for metadata, governance state, and audit data
- S3-compatible object storage for immutable artifacts and blobs
- Redis for shared cache, coordination helpers, and future session/rate-limit data
- a dedicated search service for derived discovery views
No critical runtime state may be stored only on the local container filesystem.
Probe semantics
Publaryn distinguishes liveness from readiness:
- /health is the process liveness probe and returns 200 OK while the process is alive.
- /readiness is the traffic readiness probe for database-backed requests and returns 200 OK only when PostgreSQL connectivity succeeds and, when Redis is configured for the deployment, Redis connectivity succeeds; otherwise it returns 503 Service Unavailable.
This allows orchestrators to keep a live process running while removing an unready replica from service.
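The probe rules above can be sketched as a small decision function. This is a minimal Python illustration, not Publaryn's actual handler: the function names and the callable-based checks (check_postgres, check_redis) are hypothetical stand-ins for real connectivity checks against the configured stores.

```python
from http import HTTPStatus
from typing import Callable, Optional


def liveness_status() -> HTTPStatus:
    """Liveness consults no dependencies: a process that can answer is alive."""
    return HTTPStatus.OK


def readiness_status(
    check_postgres: Callable[[], bool],
    check_redis: Optional[Callable[[], bool]] = None,
) -> HTTPStatus:
    """Return 200 only when every configured hard dependency responds.

    check_redis is None when the deployment does not configure Redis
    (an assumption about how "configured" would be represented).
    """
    if not check_postgres():
        return HTTPStatus.SERVICE_UNAVAILABLE
    # Redis is checked only when the deployment configures it.
    if check_redis is not None and not check_redis():
        return HTTPStatus.SERVICE_UNAVAILABLE
    return HTTPStatus.OK
```

With this shape, a replica whose database is unreachable still passes liveness but fails readiness, so an orchestrator would stop routing traffic to it without restarting it.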
Graceful shutdown
The API server must shut down gracefully on SIGTERM and on Ctrl+C (SIGINT). When a shutdown signal is received, the server stops accepting new connections and allows in-flight requests to finish before exiting.
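Container deployments should provide a grace period long enough for ordinary management and publish requests to complete, because the platform force-kills a container that is still running when the grace period expires. The local Compose baseline uses a 30-second stop grace period.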
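The drain behavior can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the server's real implementation: RequestDrainer, ShutdownFlag, and install_signal_handlers are hypothetical names, and a real server would wire try_begin and end into its connection handling.

```python
import signal
import threading


class RequestDrainer:
    """Tracks in-flight requests so shutdown can wait for them to finish."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._in_flight = 0
        self._accepting = True

    def try_begin(self) -> bool:
        """Admit a request; returns False once shutdown has started."""
        with self._cond:
            if not self._accepting:
                return False
            self._in_flight += 1
            return True

    def end(self) -> None:
        """Mark one admitted request as finished."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def drain(self, grace_seconds: float) -> bool:
        """Stop admitting requests, then wait up to the grace period.

        Returns True if all in-flight requests finished in time.
        """
        with self._cond:
            self._accepting = False
            return self._cond.wait_for(
                lambda: self._in_flight == 0, timeout=grace_seconds
            )


class ShutdownFlag:
    """Set from a signal handler; it only assigns a bool and takes no locks."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum, frame) -> None:
        self.requested = True


def install_signal_handlers(flag: ShutdownFlag) -> None:
    # Must be called from the main thread. SIGINT is what Ctrl+C delivers.
    signal.signal(signal.SIGTERM, flag.handle)
    signal.signal(signal.SIGINT, flag.handle)
```

In this sketch the main thread would poll flag.requested and then call drain with a timeout somewhat below the platform grace period (for example, below the 30 seconds of the Compose baseline), leaving time to exit cleanly before a forced kill.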
Consequences
- API replicas can participate in rolling updates and horizontal scale-down without introducing avoidable request loss.
- The readiness probe now reports PostgreSQL failure and, when Redis is configured for the deployment, Redis failure to load balancers and orchestrators, so they can take an unready replica out of service.
- The runtime contract stays aligned with the modular-monolith architecture: strong transactional consistency in shared stores, but no single-node dependency in the API tier.
- Future worker processes should follow the same lifecycle rules, including signal-aware draining and idempotent retry behavior.
- Readiness checks for further dependencies, such as object storage or search, can be added later if a specific endpoint class requires them to be treated as hard readiness dependencies. Redis is already checked whenever it is configured for the deployment.
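The worker lifecycle rule in the consequences above (signal-aware draining plus idempotent retries) might look like the following Python sketch. Everything here is hypothetical: run_worker, its parameters, and the in-memory completed set, which stands in for a record kept in a shared store such as PostgreSQL or Redis.

```python
from typing import Callable, Iterable, Set


def run_worker(
    jobs: Iterable[str],
    handle: Callable[[str], None],
    completed: Set[str],
    stop_requested: Callable[[], bool],
) -> int:
    """Process job keys until the input ends or a stop is requested.

    Returns the number of jobs whose handler actually ran.
    """
    ran = 0
    for key in jobs:
        # Check between jobs, so a job that has started always runs to completion.
        if stop_requested():
            break
        # Idempotency: a redelivered key that already completed is skipped.
        if key in completed:
            continue
        handle(key)
        completed.add(key)
        ran += 1
    return ran
```

A crash between handle and the completion record would cause the job to run again on retry, so the handler itself should also be safe to repeat.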