calls.cxbridge.io fronting both fleet.cxbridge.io and fleet2.cxbridge.io.
All ingress assets live in the Core service repo under infra/haproxy/ and are brand-portable — nothing client-specific is baked into the templates.
Routing scope
The ingress fronts only long-lived WebSocket call routes. Short POST routes are deliberately rejected because they return before the call pipeline finishes, so connection-count load balancing does not represent worker occupancy for them.

| Route | Behavior |
|---|---|
| /ws/{bot_id} | Proxied to a fleet (telephony dialler WebSocket) |
| /exotel/{bot_id} | Proxied to a fleet (Exotel Voicebot applet) |
| /haproxy-health | Local probe, returns 200 ok |
| /attach, /livekit/dialout, /livekit/widget, /livekit/dispatch | Rejected (404); these use CXB API/CXB Dialler app-level fleet selection via /health/fleet |
| anything else | 404 from HAProxy |
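The routing table above can be read as a small decision function. The following is a minimal Python sketch of that logic, not the HAProxy ACLs themselves; the function name `classify` and the assumption that `{bot_id}` is a single path segment are illustrative only.

```python
import re

# Long-lived WebSocket routes that the ingress proxies to a fleet.
# Assumption: {bot_id} is exactly one non-empty path segment.
PROXIED = re.compile(r"^/(ws|exotel)/[^/]+$")


def classify(path: str) -> str:
    """Return 'local', 'proxy', or 'reject' for a request path."""
    if path == "/haproxy-health":
        return "local"   # answered by HAProxy itself with 200 ok
    if PROXIED.match(path):
        return "proxy"   # forwarded to a fleet
    return "reject"      # short POST routes and anything else get 404
```

Under these assumptions, `classify("/ws/bot42")` returns `"proxy"`, while `classify("/attach")` and `classify("/")` both return `"reject"`.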
Capacity-aware health check
leastconn plus per-server maxconn 16 mirrors the per-host worker count, but maxconn only counts sessions HAProxy itself routed. Direct fleet-URL traffic, /attach, and /livekit/* bypass the ingress, so real worker occupancy can exceed HAProxy’s count.
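To make the gap between HAProxy's session count and real occupancy concrete, here is a small Python model. It is a sketch only: the `Fleet` class, its field names, and the sample numbers are hypothetical, and where HAProxy would queue a connection when every server is at maxconn, this model simply returns `None`.

```python
from dataclasses import dataclass

MAXCONN = 16  # per-server cap, mirroring the per-host worker count


@dataclass
class Fleet:
    name: str
    routed: int = 0  # sessions HAProxy itself routed (all that maxconn sees)
    direct: int = 0  # sessions that bypassed the ingress (invisible to HAProxy)

    @property
    def occupancy(self) -> int:
        return self.routed + self.direct


def pick_leastconn(fleets):
    """Pick the fleet with the fewest HAProxy-routed sessions below maxconn."""
    candidates = [f for f in fleets if f.routed < MAXCONN]
    return min(candidates, key=lambda f: f.routed) if candidates else None


# Hypothetical snapshot: "fleet" looks lightly loaded to HAProxy,
# but direct traffic has already filled every worker.
fleets = [Fleet("fleet", routed=4, direct=12), Fleet("fleet2", routed=6)]
chosen = pick_leastconn(fleets)
```

In this snapshot `chosen` is `fleet`, because it has the lowest routed count, even though its real occupancy is already 16.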
To avoid routing into a saturated fleet, the health check inspects the /health/fleet JSON body. When fleet_available reaches 0, the body stops matching, HAProxy marks that backend DOWN regardless of its own session count, and leastconn routes only to fleets with real capacity left.
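The body match can be sketched in Python as follows. The field name `fleet_available` comes from the text; the regular expression and the exact JSON shape are assumptions, since the real check pattern is not shown here.

```python
import re

# Assumed pattern: matches only while fleet_available is a positive integer.
AVAILABLE = re.compile(r'"fleet_available"\s*:\s*[1-9]')


def body_reports_capacity(body: str) -> bool:
    """True when the /health/fleet body advertises at least one free worker."""
    return AVAILABLE.search(body) is not None
```

A text match like this needs no JSON parsing, which suits a load balancer health check, but it depends on the field being serialized as a bare integer.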
Check timing and flapping tradeoff
Fleet server lines use check inter 1s fall 2 rise 2: a fleet must fail two consecutive 1s probes (~2s) before ejection, and pass two before returning. This is not hair-trigger (fall 1) on purpose:
- With only two fleets, ejecting on a single transient slow /health/fleet response dumps all carrier load onto the one survivor, manufacturing the exact saturation the check was meant to prevent. Fast ejection + low fleet count produces flapping cascades. fall 2 smooths single-probe blips.
- The cost: a ~2s window where a newly-saturated fleet can still receive a session or two before ejection. Those land at fleet nginx and return 429, which the carrier sees as-is.
- With a third fleet, fall 1 becomes safer because a single survivor is no longer the failure mode; revisit then.
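The fall/rise behavior described above can be simulated with a small state machine. This is a simplified Python model of consecutive-probe counting, not HAProxy's implementation; `CheckState` and `replay` are illustrative names.

```python
class CheckState:
    """Tracks UP/DOWN for one server under fall/rise thresholds."""

    def __init__(self, fall: int = 2, rise: int = 2):
        self.fall, self.rise = fall, rise
        self.up = True
        self.streak = 0  # consecutive probes contradicting the current state

    def probe(self, ok: bool) -> bool:
        if ok == self.up:
            self.streak = 0  # probe agrees with the current state, so reset
        else:
            self.streak += 1
            threshold = self.rise if ok else self.fall
            if self.streak >= threshold:
                self.up, self.streak = ok, 0
        return self.up


def replay(results, fall=2, rise=2):
    """Feed a sequence of probe results and return the state after each."""
    state = CheckState(fall, rise)
    return [state.probe(r) for r in results]
```

With `fall=2`, a single failed probe in `[True, False, True, True]` never ejects the fleet. With `fall=1`, the same sequence takes the fleet DOWN for two probe intervals before it recovers.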
Retries and 429
option redispatch + retry-on conn-failure empty-response response-timeout retries a different fleet on connection-level failures.
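As a rough illustration of which outcomes are retried, the sketch below models the three failure classes named on the retry-on line. The function and its return values are hypothetical, and retry limits are not modeled. A 429 is a complete HTTP response rather than a connection-level failure, so in this model it is relayed unchanged.

```python
# Failure classes named on the retry-on line.
RETRYABLE = {"conn-failure", "empty-response", "response-timeout"}


def next_action(outcome: str, tried: list, fleets: list):
    """Decide what happens after one attempt against a fleet.

    outcome is either a retry-on failure class or an HTTP status as a string.
    """
    if outcome in RETRYABLE:
        remaining = [f for f in fleets if f not in tried]
        if remaining:
            return ("retry", remaining[0])  # redispatch to a different fleet
        return ("fail", None)               # no untried fleet is left
    return ("relay", outcome)               # a real HTTP response passes through
```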
TLS and cert renewal
The :443 bind asserts a TLS floor of TLSv1.2 explicitly rather than relying on the OS OpenSSL policy. Cipher selection is left to the OpenSSL default to avoid rejecting a carrier’s TLS stack.
Cert renewal uses HTTP-01 via HAProxy — no downtime:
- Cron triggers certbot renew.
- Certbot starts a temporary listener on 127.0.0.1:8888.
- HAProxy’s :80 frontend routes /.well-known/acme-challenge/* to that listener.
- Let’s Encrypt validates, certbot writes the new cert.
- The deploy hook (renewal-hook.sh) atomically rebuilds the combined /etc/haproxy/certs/<domain>.pem and reloads HAProxy.
This flow requires authenticator = standalone and http01_port = 8888 in /etc/letsencrypt/renewal/<domain>.conf. Verify with certbot renew --dry-run.
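The atomic rebuild in the deploy hook can be illustrated with a write-then-rename pattern. The real hook is a shell script whose contents are not shown here; this Python sketch only demonstrates the technique. It assumes the input files use certbot's standard live-directory names (fullchain.pem and privkey.pem), and it leaves out the HAProxy reload.

```python
import os
import tempfile
from pathlib import Path


def rebuild_combined_pem(live_dir: Path, out_path: Path) -> None:
    """Concatenate chain + private key and swap the result in atomically."""
    combined = (live_dir / "fullchain.pem").read_bytes()
    combined += (live_dir / "privkey.pem").read_bytes()

    # Write to a temp file in the target directory so that os.replace is a
    # same-filesystem rename; readers see either the old or the new file.
    fd, tmp = tempfile.mkstemp(dir=out_path.parent, prefix=".pem-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(combined)
            fh.flush()
            os.fsync(fh.fileno())
        os.chmod(tmp, 0o600)       # the file contains the private key
        os.replace(tmp, out_path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Because the rename is atomic, a reload that happens mid-renewal never reads a half-written PEM.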
Tooling
| Script | Purpose |
|---|---|
| render.sh <domain> [fleet-file] [out-dir] | Renders the brand-agnostic templates for one deployment, substituting __INGRESS_DOMAIN__ and __FLEET_SERVERS__ (from fleet-servers.txt). Writes haproxy.cfg + the renewal hook, guards against surviving placeholders, and runs a structural haproxy -c (cert + DNS excluded, validated on the target host at install). Safe to run on CI/laptop. |
| add-fleet.sh [--dry-run] <hostname> | Adds an inbound fleet to the live ingress: pre-checks (root, DNS, /health/fleet), clones the last server line so the new fleet inherits the exact flags, backs up, validates, reloads zero-downtime, and waits for the backend to read UP, rolling back automatically on any failure. |
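The substitute-then-guard step that render.sh performs can be sketched as below. This is a Python illustration of the idea rather than the script's actual code. The two placeholder names come from the table; the `render` function, the leftover-placeholder regex, and any template text passed to it are assumptions.

```python
import re

# Assumed shape of a template placeholder: __UPPER_SNAKE__.
PLACEHOLDER = re.compile(r"__[A-Z][A-Z_]*__")


def render(template: str, domain: str, fleet_servers: list) -> str:
    """Substitute the known placeholders and refuse to emit leftovers."""
    out = template.replace("__INGRESS_DOMAIN__", domain)
    out = out.replace("__FLEET_SERVERS__", "\n".join(fleet_servers))

    leftover = PLACEHOLDER.findall(out)
    if leftover:
        raise ValueError(f"unrendered placeholders: {sorted(set(leftover))}")
    return out
```

Failing loudly on a surviving placeholder keeps a half-rendered config from ever reaching the validation or install step.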
Do not hand-edit server lines in /etc/haproxy/haproxy.cfg; use add-fleet.sh. It exists precisely so ops don’t have to understand HAProxy syntax, SNI, or the health-check coupling.
Add a fleet: two paths
Adding a fleet touches two independent paths. Do both.
Provision and deploy
Deploy CXB Core on the new host (standard playbook). Confirm https://<new-host>/health/fleet responds.
(1) Inbound WSS: add-fleet.sh
On the ingress host, run ./add-fleet.sh fleet3.cxbridge.io (or --dry-run to preview). The script clones, validates, reloads, and rolls back on failure. Keep fleet-servers.txt in the repo in sync (append the same line) so a future re-render matches the box; the script edits the live config only.
Sizing and single point of failure
| Fleets | Concurrent calls | Ingress sizing |
|---|---|---|
| 1 | 16 | n/a (no ingress) |
| 2 | 32 | 4 vCPU / 8 GB single VM |
| 3 | 48 | 4 vCPU / 8 GB single VM |
| 4-5 | 64-80 | 8 vCPU; monitor TLS handshake CPU |
| 6+ | 96+ | second ingress + managed LB or VRRP |
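The concurrent-call column is the fleet count multiplied by the 16 workers per host. A minimal Python sketch that reproduces the table follows; the function names are illustrative, and the tier boundaries are copied from the rows above.

```python
WORKERS_PER_FLEET = 16  # matches the per-server maxconn 16


def concurrent_calls(fleets: int) -> int:
    """Total concurrent call capacity across all fleets."""
    return fleets * WORKERS_PER_FLEET


def ingress_sizing(fleets: int) -> str:
    """Return the ingress sizing tier for a given fleet count."""
    if fleets <= 1:
        return "n/a (no ingress)"
    if fleets <= 3:
        return "4 vCPU / 8 GB single VM"
    if fleets <= 5:
        return "8 vCPU; monitor TLS handshake CPU"
    return "second ingress + managed LB or VRRP"
```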