Health checks - CX Bridge

CXB Core exposes two health endpoints: one per-worker and one fleet-wide aggregated view.

Per-worker health

GET /health

Returns the call status of the individual worker that handles the request.

{
  "status": "ok",
  "worker_calls": 0,
  "worker_max": 1,
  "total_capacity": 16,
  "total_available": 16,
  "workers": 16,
  "worker_id": "3",
  "worker_pid": 12345
}

Field	Type	Description
`status`	string	Always `"ok"` if the worker is responding
`worker_calls`	int	Active calls on this worker (0 or 1)
`worker_max`	int	`MAX_CONCURRENT_CALLS` setting (always 1)
`total_capacity`	int	`worker_max × workers` (fleet total)
`total_available`	int	`total_capacity - worker_calls` (estimated)
`workers`	int	`CXBCORE_WORKERS` setting
`worker_id`	string	Worker number (derived from socket path)
`worker_pid`	int	OS process ID

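When called through nginx, the request lands on whichever worker nginx selects, so total_available is only an estimate from that worker’s perspective: it subtracts that worker’s own calls from the fleet capacity and does not account for calls on other workers. Use /health/fleet for accurate totals.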
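The arithmetic in the field table can be written out as a short Python sketch. The function name worker_health_estimate is hypothetical and not part of CXB Core; it only reproduces the documented formulas to show why total_available is an estimate.

```python
def worker_health_estimate(worker_calls: int, worker_max: int, workers: int) -> dict:
    """Reproduce the documented per-worker arithmetic (illustrative only)."""
    # Fleet capacity is this worker's limit multiplied by the configured worker count.
    total_capacity = worker_max * workers
    # Only this worker's own calls are subtracted; calls on other workers are not visible here.
    total_available = total_capacity - worker_calls
    return {"total_capacity": total_capacity, "total_available": total_available}


# Values from the sample response: an idle worker in a 16-worker fleet.
print(worker_health_estimate(worker_calls=0, worker_max=1, workers=16))
```

An idle worker reports the full capacity as available regardless of how busy the other workers are, which is why the fleet endpoint is the better source for totals.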

Fleet health

GET /health/fleet

Queries all workers on the local server over their Unix sockets and aggregates their status.

{
  "status": "ok",
  "fleet_calls": 3,
  "fleet_max": 16,
  "fleet_available": 13,
  "workers": [
    {"id": "1", "calls": 1, "max": 1, "pid": 12345},
    {"id": "2", "calls": 0, "max": 1, "pid": 12346},
    {"id": "3", "calls": 1, "max": 1, "pid": 12347},
    {"id": "4", "calls": 0, "max": 1, "pid": 12348},
    ...
  ]
}

Field	Type	Description
`fleet_calls`	int	Total active calls across all workers
`fleet_max`	int	Total capacity across all workers
`fleet_available`	int	Available call slots
`workers`	list	Per-worker breakdown

If a worker is unreachable, its entry shows {"id": "N", "status": "unreachable"}.
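A client can use the per-worker breakdown to separate busy, idle, and unreachable workers. The following is a minimal Python sketch, assuming the response shape shown in the sample. The function name classify_workers is hypothetical, and treating a worker whose calls have reached max as busy is this example's own convention.

```python
import json


def classify_workers(body: str) -> dict:
    """Group worker ids from a /health/fleet body by state (illustrative only)."""
    fleet = json.loads(body)
    groups = {"busy": [], "idle": [], "unreachable": []}
    for worker in fleet["workers"]:
        if worker.get("status") == "unreachable":
            # Unreachable entries carry only "id" and "status".
            groups["unreachable"].append(worker["id"])
        elif worker["calls"] >= worker["max"]:
            groups["busy"].append(worker["id"])
        else:
            groups["idle"].append(worker["id"])
    return groups


# Made-up body in the documented shape, with one unreachable worker.
sample = json.dumps({
    "status": "ok",
    "workers": [
        {"id": "1", "calls": 1, "max": 1, "pid": 12345},
        {"id": "2", "calls": 0, "max": 1, "pid": 12346},
        {"id": "3", "status": "unreachable"},
    ],
})
print(classify_workers(sample))
```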

The fleet_available value is parsed by the HAProxy WSS ingress capacity health check, which matches it against the PCRE2 pattern "fleet_available":([1-9]|1[0-6])[,}] (bounded to 1-16). Two coupling hazards:

Changing the /health/fleet JSON shape (renaming the field, reordering keys, or altering how the number is serialized) can silently break the ingress check, marking healthy fleets DOWN.
Scaling a fleet past 16 workers lets fleet_available exceed 16. The pattern then stops matching and the higher-capacity fleet is marked DOWN, during the very operation meant to add capacity. When you raise maxconn past 16, raise this upper bound in the HAProxy template and README in the same change, and re-validate against a captured body.
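Re-validating the pattern against a captured body can be scripted. The sketch below uses Python's re module rather than PCRE2, which is an assumption that holds only because this pattern uses syntax both engines treat the same way (literals, alternation, character classes). It also assumes the check searches anywhere in the body. The sample bodies are made up for illustration; a real check should use a body captured from the fleet.

```python
import re

# Pattern quoted above from the HAProxy ingress capacity check.
FLEET_AVAILABLE = re.compile(r'"fleet_available":([1-9]|1[0-6])[,}]')


def ingress_would_pass(body: str) -> bool:
    """Return True if the body contains a fleet_available value the pattern accepts."""
    return FLEET_AVAILABLE.search(body) is not None


# Hypothetical bodies; replace with a body captured from /health/fleet.
for body in (
    '{"status":"ok","fleet_calls":3,"fleet_max":16,"fleet_available":13,"workers":[]}',
    '{"status":"ok","fleet_calls":16,"fleet_max":16,"fleet_available":0,"workers":[]}',
    '{"status":"ok","fleet_calls":3,"fleet_max":20,"fleet_available":17,"workers":[]}',
    '{"status": "ok", "fleet_available": 13, "workers": []}',
):
    print(ingress_would_pass(body), body)
```

Only the first body passes. The second has no free slots, the third exceeds the 16 bound, and the fourth fails purely because of the space after the colon, which illustrates the serialization hazard.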

Monitoring commands

Fleet-wide view (from anywhere):

curl -s https://fleet.example.com/health/fleet | python3 -m json.tool

Quick status check:

curl -s https://fleet.example.com/health
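For scripted monitoring, the same endpoints can be read from Python's standard library. This is a minimal sketch: the function names fetch_json and fleet_status_line are hypothetical, the 5-second timeout is an arbitrary choice, and the output format is this example's own.

```python
import json
import urllib.request


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fleet_status_line(fleet: dict) -> str:
    """Format the documented fleet totals as a single line."""
    return (
        f"{fleet['status']}: {fleet['fleet_calls']}/{fleet['fleet_max']} calls, "
        f"{fleet['fleet_available']} available"
    )


# Example with the totals from the sample response (no network access needed).
print(fleet_status_line(
    {"status": "ok", "fleet_calls": 3, "fleet_max": 16, "fleet_available": 13}
))
```

Against a live fleet, the two functions combine as fleet_status_line(fetch_json("https://fleet.example.com/health/fleet")).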
