Scalability
This page expands on Section 8 of the Architecture Overview.
Read each control as a guardrail against a specific failure mode: saturation, unfairness, or orchestration overload.
Scalability Model
Horizontal service scaling
Current baseline replicas in the app namespace:
- Contestant Service: 3
- Deployment Center: 3
- Challenge Gateway: 2
- Deployment Consumer: 1
- Deployment Listener: 1
- Contestant Portal/Admin Portal: 1 each
HPA (autoscaling/v2) is configured in the default app manifests for selected services:
- Contestant Service: min 3, max 6.
- Deployment Center: min 2, max 5.
- Challenge Gateway: min 2, max 5.
- Contestant Portal: min 3, max 6.
These HPAs currently target CPU and memory utilization (averageUtilization: 90).
This supports horizontal API scaling while keeping worker/reconciler semantics under explicit control.
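The HPA configuration described above can be sketched as an autoscaling/v2 manifest. The metadata and target names below are illustrative; only the min/max replica counts and the averageUtilization target of 90 come from this page.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: contestant-service   # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: contestant-service # illustrative name
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 90
```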
Async throughput controls
- deployment_queue max length: 300.
- Deployment Consumer prefetch: 40.
- Deployment Consumer batch size: 20.
- max running workflows gate: 30.
- worker polling interval: 2 seconds.
These controls provide bounded admission to avoid workflow storms and cluster saturation.
Raising worker concurrency without adjusting queue limits and workflow caps can destabilize the cluster faster than it improves throughput.
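The bounded-admission behavior of these knobs can be sketched in a few lines of Python. This is an illustrative model, not the actual consumer code: the class and method names are hypothetical, and only the numeric limits are taken from this page.

```python
from collections import deque

QUEUE_MAX = 300      # deployment_queue max length
PREFETCH = 40        # consumer prefetch (not modeled further in this sketch)
BATCH_SIZE = 20      # messages handled per polling tick
MAX_RUNNING = 30     # max running workflows gate

class BoundedConsumer:
    """Toy model of bounded admission: a capped queue feeding a capped
    set of running workflows, drained in batches."""

    def __init__(self):
        self.queue = deque()
        self.running = set()

    def enqueue(self, msg):
        """Admit a message only while the queue is under its cap, so
        publishers see backpressure instead of causing a workflow storm."""
        if len(self.queue) >= QUEUE_MAX:
            return False
        self.queue.append(msg)
        return True

    def poll(self):
        """One polling tick (every 2 s in production): start at most
        BATCH_SIZE workflows, never exceeding MAX_RUNNING concurrently."""
        budget = min(BATCH_SIZE, MAX_RUNNING - len(self.running))
        started = []
        while budget > 0 and self.queue:
            msg = self.queue.popleft()
            self.running.add(msg)
            started.append(msg)
            budget -= 1
        return started

    def finish(self, msg):
        self.running.discard(msg)
```

Note how the running-workflows gate, not the batch size, becomes the binding limit once 30 workflows are in flight: further polls start nothing until some workflow finishes.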
Runtime access scaling
Challenge Gateway capacity controls include:
- global TCP max connections,
- per-IP TCP connections,
- per-token TCP connections,
- HTTP rate limiting per token/IP pair and per IP.
These controls protect challenge infrastructure from abuse and accidental overload.
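The per-IP and per-token limits above can be approximated with a token bucket per key. This is a hedged sketch under assumed semantics, not the gateway's implementation; the class name, rate, and burst parameters are illustrative.

```python
import time

class KeyedRateLimiter:
    """One token bucket per key (an IP, a token, or a token/IP pair).
    Each request costs one token; tokens refill at `rate_per_sec` up to
    `burst`. The `clock` hook exists only to make the sketch testable."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock
        self.buckets = {}  # key -> (tokens, last_refill_time)

    def allow(self, key):
        now = self.clock()
        tokens, last = self.buckets.get(key, (self.burst, now))
        # refill proportionally to elapsed time, capped at the burst size
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[key] = (tokens, now)
            return False
        self.buckets[key] = (tokens - 1, now)
        return True
```

Because buckets are independent per key, one abusive IP exhausts only its own bucket and cannot starve other contestants.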
Team-level fairness controls
- The concurrent challenge cap per team is enforced atomically via a Redis ZSET.
- Max deploy count and max attempts are checked per challenge and per team.
Fairness is enforced atomically, not on a best-effort basis.
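The check-and-register semantics behind the team cap can be sketched in pure Python. In production this page says the cap lives in a Redis ZSET (timestamps as scores, with stale entries pruned before checking); here a lock plus a per-team dict stands in for that single atomic critical section. The names and the TTL value are illustrative.

```python
import threading
import time

class TeamChallengeCap:
    """Toy model of the atomic per-team concurrency cap: prune expired
    entries, check the cap, and register the new challenge, all inside
    one critical section (analogous to one atomic Redis operation)."""

    def __init__(self, max_concurrent, ttl_seconds=3600, clock=time.monotonic):
        self.max_concurrent = max_concurrent
        self.ttl = ttl_seconds
        self.clock = clock
        self.lock = threading.Lock()
        self.active = {}  # team -> {challenge_id: started_at}

    def try_acquire(self, team, challenge_id):
        now = self.clock()
        with self.lock:
            entries = self.active.setdefault(team, {})
            # prune stale entries, like ZREMRANGEBYSCORE on old timestamps
            for cid, ts in list(entries.items()):
                if now - ts > self.ttl:
                    del entries[cid]
            if len(entries) >= self.max_concurrent:
                return False
            entries[challenge_id] = now
            return True

    def release(self, team, challenge_id):
        with self.lock:
            self.active.get(team, {}).pop(challenge_id, None)
```

The point of the atomic check-and-add is that two concurrent requests from the same team cannot both pass the cap check before either registers, which a read-then-write sequence would allow.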
Current scaling trade-offs
- Scaling is hybrid: HPA is enabled for selected stateless services, while worker/reconciler components remain manually scaled for predictable orchestration and reconciliation behavior.
- A single Deployment Consumer simplifies ordering and control, but can become a throughput bottleneck if event volume grows sharply.
Validate throughput settings with quick load tests and race-condition tests before event day, then freeze the critical scaling knobs during the contest window.