Quick triage

Problem | First checks
Campaign will not start | Bot exists, number pool not empty, concurrency > 0, time window valid, CSV uploaded.
Campaign running but no calls | Time window, CXB Dialler health, fleet capacity, pending/retryable rows.
Many no-answer/rejected | Carrier/SIP status, number quality, time of day, caller ID reputation.
Calls answer but bot does not join | CXB Core attach errors, fleet availability, LiveKit room state.
Bot says wrong customer info | CSV headers, CRM pre-fetch, prompt variables.
Retries not happening | Max attempts reached, retry delay not elapsed, outcome not in retry rules.
Attempt report missing expected data | Attempt still in progress or result webhook not finalized.
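
The three "Retries not happening" checks can be expressed as one small function. This is only a sketch of the reasoning, not the dialler's actual code: the function name, parameter names, and the example outcome values are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retry_block_reason(
    attempts: int,
    max_attempts: int,
    last_outcome: str,
    retry_outcomes: set,
    last_attempt_at: datetime,
    retry_delay: timedelta,
    now: datetime,
) -> Optional[str]:
    """Return why a row is not being retried, or None if a retry is due."""
    if attempts >= max_attempts:
        return "max attempts reached"
    if last_outcome not in retry_outcomes:
        return "outcome not in retry rules"
    if now - last_attempt_at < retry_delay:
        return "retry delay not elapsed"
    return None


# Example: one attempt made 5 minutes ago, retry delay of 30 minutes.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
reason = retry_block_reason(
    attempts=1,
    max_attempts=3,
    last_outcome="no_answer",
    retry_outcomes={"no_answer", "rejected"},  # assumed rule set
    last_attempt_at=now - timedelta(minutes=5),
    retry_delay=timedelta(minutes=30),
    now=now,
)
print(reason)
```

Walking a single row through these three checks by hand usually explains a missing retry before any escalation is needed.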

Data issues

If the bot reads a raw placeholder aloud, for example:
{{CUSTOMERNAME}}
then the variable was never filled in. Check:
  • CSV header spelling
  • CRM response field spelling
  • prompt variable spelling
  • flat vs namespaced variable usage
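
The checks above can be partly automated by comparing the placeholders in a prompt with the variable names the data sources actually provide. The sketch below assumes placeholders use the {{NAME}} form shown above and that namespaced variables are written with a dot (for example crm.balance). The dotted form, the function name, and the sample data are assumptions for illustration only.

```python
import re

# Matches {{NAME}} and {{namespace.NAME}}, allowing spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_.]+)\s*\}\}")


def missing_variables(prompt: str, available: set) -> list:
    """Return placeholder names used in the prompt that no source provides.

    The comparison is exact and case-sensitive, so spelling differences
    between the prompt and the CSV/CRM field names show up as missing.
    """
    missing = []
    for name in PLACEHOLDER.findall(prompt):
        if name not in available and name not in missing:
            missing.append(name)
    return missing


# Hypothetical prompt and variable names.
prompt = "Hello {{CUSTOMERNAME}}, your balance is {{crm.balance}}."
available = {"CUSTOMER_NAME", "crm.balance"}  # CSV headers plus CRM fields
print(missing_variables(prompt, available))
```

Here the CSV header is spelled CUSTOMER_NAME while the prompt uses CUSTOMERNAME, so the prompt variable is reported as missing.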

Dialler health

Engineering can check on the CXB Dialler host:
curl -fsS http://127.0.0.1:8090/health
uv run python scripts/smoke_check.py
The smoke check validates settings, MongoDB, indexes, health, metrics, fleet reachability, and stale campaign states.

Stuck states

Escalate if many rows have been sitting for a long time in any of these states:
  • leased
  • ringing
  • amd_screening
  • attaching
  • dialling
  • in_progress
  • _processing
The states listed above should move forward or be recovered by stale handling. Persistent buildup means the dialler, LiveKit, CXB Core, or result path needs investigation.
callback_scheduled and retry_scheduled are normal waiting states, not stuck states: they hold until their due time. Only escalate if a callback_scheduled row stays past its scheduled time while the campaign is running and within its window.
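
As a rough illustration of the rule above, the sketch below counts rows that look stuck. It is not the dialler's stale handling. The row fields (state, updated_at, due_at), the 15-minute threshold, and the function name are all assumptions. The _processing entry is left out because its full state name is not given here.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# In-flight states from the list above; these should keep moving.
ACTIVE_STATES = {
    "leased", "ringing", "amd_screening",
    "attaching", "dialling", "in_progress",
}


def stale_counts(rows, now, max_age=timedelta(minutes=15), campaign_active=True):
    """Count rows per state that look stuck.

    Active states are stale when not updated for longer than max_age.
    callback_scheduled is only counted when overdue and the campaign is
    running within its window. retry_scheduled is never counted.
    """
    counts = Counter()
    for row in rows:
        state = row["state"]
        if state in ACTIVE_STATES:
            if now - row["updated_at"] > max_age:
                counts[state] += 1
        elif state == "callback_scheduled" and campaign_active:
            due_at = row.get("due_at")
            if due_at is not None and now > due_at:
                counts[state] += 1
    return counts


# Hypothetical sample rows.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = [
    {"state": "ringing", "updated_at": now - timedelta(minutes=30)},
    {"state": "ringing", "updated_at": now - timedelta(minutes=1)},
    {"state": "retry_scheduled", "updated_at": now - timedelta(hours=2)},
    {"state": "callback_scheduled", "updated_at": now - timedelta(hours=1),
     "due_at": now - timedelta(minutes=10)},
    {"state": "callback_scheduled", "updated_at": now - timedelta(hours=1),
     "due_at": now + timedelta(minutes=10)},
]
print(dict(stale_counts(rows, now)))
```

A single stale row is rarely worth escalating; it is the count per state growing over time that points at the dialler, LiveKit, CXB Core, or the result path.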

SIP patterns

Repeated SIP status patterns usually point outside bot logic:
Pattern | Common meaning
408 / 480 | No answer or temporarily unavailable.
486 / 603 | Busy/rejected/declined.
429 | Rate limited.
5xx | Carrier/provider/server issue.
How the dialler maps SIP to a disconnect reason: on an unanswered dial, a SIP code of 408, 480, or an empty/missing code is recorded as no_answer; any other non-answer code is recorded as rejected. So 486 and 603 (and most other failure codes) are recorded as rejected, while ring-no-answer and timeouts are recorded as no_answer.
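
The mapping described above is small enough to write out. This is a sketch of the stated rule only, not the dialler's source; the function name and the accepted input types are assumptions.

```python
# SIP codes that the rule above treats as "nobody picked up".
NO_ANSWER_CODES = {"", "408", "480"}


def disconnect_reason(sip_code) -> str:
    """Map the SIP code of an unanswered dial to a disconnect reason.

    Accepts an int, a string, or None (missing code).
    """
    code = "" if sip_code is None else str(sip_code).strip()
    if code in NO_ANSWER_CODES:
        return "no_answer"
    return "rejected"


for code in (408, 480, None, 486, 603, 429, 503):
    print(code, disconnect_reason(code))
```

Under this rule, rate limiting (429) and carrier failures (5xx) also land in rejected, so a spike in rejected rows is worth breaking down by raw SIP code before drawing conclusions.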
Do not change the bot prompt to fix SIP failures.