Scalability
This page expands on Section 8 of the Architecture Overview.
Read each control as a guardrail against a specific failure mode: saturation, unfairness, or orchestration overload.
Scalability Model
Horizontal service scaling
Current baseline replicas in the app namespace:
- Contestant Service: 3
- Deployment Center: 3
- Challenge Gateway: 2
- Deployment Consumer: 1
- Deployment Listener: 1
- Contestant Portal/Admin Portal: 1 each
HPA (autoscaling/v2) is configured in the default app manifests for selected services:
- Contestant Service: min 3, max 6.
- Deployment Center: min 2, max 5.
- Challenge Gateway: min 2, max 5.
- Contestant Portal: min 3, max 6.
These HPAs currently target CPU and memory utilization (averageUtilization: 90).
This supports horizontal API scaling while keeping worker/reconciler semantics under explicit control.
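The HPA configuration described above can be sketched as an autoscaling/v2 manifest. The metadata and target names below are illustrative; only the min/max replica counts and the averageUtilization target of 90 come from this page.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: contestant-service   # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: contestant-service # illustrative name
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 90
```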
Async throughput controls
- deployment_queue max length: 300.
- Deployment Consumer prefetch: 40.
- Deployment Consumer batch size: 20.
- max running workflows gate: 30.
- worker polling interval: 2 seconds.
These controls provide bounded admission to avoid workflow storms and cluster saturation.
Raising worker concurrency without adjusting queue limits and workflow caps can destabilize the cluster faster than it improves throughput.
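The bounded-admission behavior of these knobs can be sketched in a few lines of Python. This is an illustrative model, not the actual consumer code: the class and method names are hypothetical, and only the numeric limits are taken from this page.

```python
from collections import deque

QUEUE_MAX = 300      # deployment_queue max length
PREFETCH = 40        # consumer prefetch (not modeled further in this sketch)
BATCH_SIZE = 20      # messages handled per polling tick
MAX_RUNNING = 30     # max running workflows gate

class BoundedConsumer:
    """Toy model of bounded admission: a capped queue feeding a capped
    set of running workflows, drained in batches."""

    def __init__(self):
        self.queue = deque()
        self.running = set()

    def enqueue(self, msg):
        """Admit a message only while the queue is under its cap, so
        publishers see backpressure instead of causing a workflow storm."""
        if len(self.queue) >= QUEUE_MAX:
            return False
        self.queue.append(msg)
        return True

    def poll(self):
        """One polling tick (every 2 s in production): start at most
        BATCH_SIZE workflows, never exceeding MAX_RUNNING concurrently."""
        budget = min(BATCH_SIZE, MAX_RUNNING - len(self.running))
        started = []
        while budget > 0 and self.queue:
            msg = self.queue.popleft()
            self.running.add(msg)
            started.append(msg)
            budget -= 1
        return started

    def finish(self, msg):
        self.running.discard(msg)
```

Note how the running-workflows gate, not the batch size, becomes the binding limit once 30 workflows are in flight: further polls start nothing until some workflow finishes.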
Runtime access scaling
Challenge Gateway capacity controls include:
- global TCP max connections,
- per-IP TCP connections,
- per-token TCP connections,
- HTTP rate limiting per token/IP pair and per IP.
These controls protect challenge infrastructure from abuse and accidental overload.
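The per-IP and per-token limits above can be approximated with a token bucket per key. This is a hedged sketch under assumed semantics, not the gateway's implementation; the class name, rate, and burst parameters are illustrative.

```python
import time

class KeyedRateLimiter:
    """One token bucket per key (an IP, a token, or a token/IP pair).
    Each request costs one token; tokens refill at `rate_per_sec` up to
    `burst`. The `clock` hook exists only to make the sketch testable."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock
        self.buckets = {}  # key -> (tokens, last_refill_time)

    def allow(self, key):
        now = self.clock()
        tokens, last = self.buckets.get(key, (self.burst, now))
        # refill proportionally to elapsed time, capped at the burst size
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[key] = (tokens, now)
            return False
        self.buckets[key] = (tokens - 1, now)
        return True
```

Because buckets are independent per key, one abusive IP exhausts only its own bucket and cannot starve other contestants.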
Team-level fairness controls
- The concurrent challenge cap per team is enforced atomically via a Redis ZSET.
- Max deploy count and max attempts are checked per challenge and per team.
Fairness is enforced atomically, not on a best-effort basis.
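The check-and-register semantics behind the team cap can be sketched in pure Python. In production this page says the cap lives in a Redis ZSET (timestamps as scores, with stale entries pruned before checking); here a lock plus a per-team dict stands in for that single atomic critical section. The names and the TTL value are illustrative.

```python
import threading
import time

class TeamChallengeCap:
    """Toy model of the atomic per-team concurrency cap: prune expired
    entries, check the cap, and register the new challenge, all inside
    one critical section (analogous to one atomic Redis operation)."""

    def __init__(self, max_concurrent, ttl_seconds=3600, clock=time.monotonic):
        self.max_concurrent = max_concurrent
        self.ttl = ttl_seconds
        self.clock = clock
        self.lock = threading.Lock()
        self.active = {}  # team -> {challenge_id: started_at}

    def try_acquire(self, team, challenge_id):
        now = self.clock()
        with self.lock:
            entries = self.active.setdefault(team, {})
            # prune stale entries, like ZREMRANGEBYSCORE on old timestamps
            for cid, ts in list(entries.items()):
                if now - ts > self.ttl:
                    del entries[cid]
            if len(entries) >= self.max_concurrent:
                return False
            entries[challenge_id] = now
            return True

    def release(self, team, challenge_id):
        with self.lock:
            self.active.get(team, {}).pop(challenge_id, None)
```

The point of the atomic check-and-add is that two concurrent requests from the same team cannot both pass the cap check before either registers, which a read-then-write sequence would allow.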
Current scaling trade-offs
- Scaling is hybrid: HPA is enabled for selected stateless services, while worker/reconciler components remain manually scaled for predictable orchestration and reconciliation behavior.
- A single Deployment Consumer simplifies ordering and control, but can become a throughput bottleneck if event volume grows sharply.
Validate throughput settings with quick load tests and race-condition tests before event day, then freeze the critical scaling knobs during the contest window.