Vertical scaling (same server)
Add more workers
Each worker handles one call at a time. To increase capacity, add more worker instances:
- Create the new systemd instances.
- Add the new sockets to the nginx upstream.
- Reload nginx and update .env.
Memory budget: each idle worker uses ~260 MB. On an 8 GB server, 16 workers is the safe maximum: 16 × 260 MB is roughly a 4 GB idle baseline, and the remainder is headroom for active calls. Add more RAM before adding more workers.
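The arithmetic behind that limit can be sketched in a few lines of Python. This is only an illustration of the idle baseline: the function names are hypothetical, and the memory an active call needs is not stated here, so the sketch reports the remaining headroom rather than deciding what is safe.

```python
# Minimal sketch of the idle-memory budget described above.
# IDLE_WORKER_MB comes from the text (~260 MB per idle worker);
# the function names are illustrative, not part of CXB Core.
IDLE_WORKER_MB = 260


def idle_baseline_mb(workers: int, per_worker_mb: int = IDLE_WORKER_MB) -> int:
    """Memory used by `workers` idle workers, in MB."""
    return workers * per_worker_mb


def headroom_mb(total_ram_mb: int, workers: int,
                per_worker_mb: int = IDLE_WORKER_MB) -> int:
    """RAM left over for active calls and the OS once all workers are idle."""
    return total_ram_mb - idle_baseline_mb(workers, per_worker_mb)


if __name__ == "__main__":
    total = 8 * 1024  # an 8 GB server, expressed in MB
    for n in (8, 16, 24):
        print(f"{n:>2} workers: baseline {idle_baseline_mb(n)} MB, "
              f"headroom {headroom_mb(total, n)} MB")
```

At 16 workers the idle baseline is 4160 MB, which leaves about half of an 8 GB server free. At 24 workers the headroom drops below 2 GB, which is why more RAM should come before more workers.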
Horizontal scaling (multiple servers)
Add a new fleet server
Adding a fleet touches two independent paths: inbound and outbound are wired separately. Do both, or the new fleet serves only half of your traffic.
Deploy CXB Core
Set up CXB Core on the new server with the same code, .env, and systemd template. Start 16 workers. Confirm that https://<new-host>/health/fleet responds before continuing.
Configure fleet nginx
Set up nginx with the same Unix socket upstream pattern (max_conns=1 per socket), the 429-retry blocks for /attach, /livekit/dialout, and /livekit/widget, and SSL.
(1) Inbound: register with the WSS ingress
If inbound WebSocket calls (telephony dialler, Exotel) are fronted by the HAProxy ingress, run add-fleet.sh <new-host> on the ingress host. The script clones the last server line, validates the configuration with haproxy -c, reloads with zero downtime, and rolls back automatically if the new backend does not come UP. Do not hand-edit HAProxy server lines. See WSS ingress.
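Because the new fleet should answer on /health/fleet before it is registered with the ingress, a small polling loop can serve as a pre-check. The following is a minimal sketch, not part of the CXB tooling: the function names are hypothetical, and treating HTTP 200 as "healthy" is an assumption, since the text only requires that the endpoint responds.

```python
import time
import urllib.error
import urllib.request


def fleet_health_url(host: str) -> str:
    """Build the health URL named in the deploy step."""
    return f"https://{host}/health/fleet"


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503).
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, TLS error, or timeout.
        return 0


def wait_for_fleet(host: str, attempts: int = 10, delay_s: float = 3.0,
                   fetch=http_status) -> bool:
    """Poll the health endpoint until it returns 200 (assumed healthy)."""
    url = fleet_health_url(host)
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

Calling wait_for_fleet("<new-host>") before running add-fleet.sh gives a simple yes/no answer. The fetch parameter exists only so the loop can be exercised without a live server.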