Current rating
| Area | Rating | Notes |
|---|---|---|
| Engineering architecture | 7.5 / 10 | Strong plane separation and practical runtime isolation. Needs better large-scale routing and observability. |
| Communication quality engineering | 8 / 10 | Strong listening, turn-taking, interruption controls, silence handling, tool timing, and telemetry. Needs more automated audio-quality regression checks. |
| Production readiness | 7 / 10 | Good for controlled deployments. Needs containerization, centralized metrics, and safer scale primitives. |
| Documentation quality | 7 / 10 | Institutional memory is now consolidated: this engineering handbook, an Operations tab, and per-service overviews (CXB Core, CXB API, CXB Console, CXB Dialler) exist. The earlier “scattered, not ops-friendly” rating is historical. The remaining gap is keeping the per-client deployment inventory current. |
| 1K-channel readiness | 4 / 10 | Current worker model can scale mechanically, but fleet registry, dialler partitioning, and observability need work. |
Strengths
- Clear split between runtime, control plane, UI, and dialler.
- One-call-per-worker isolation is simple and reliable.
- Shared post-call logic across transports.
- Durable outboxes for result delivery and Agent Desk enqueue.
- Strong communication-quality primitives: VAD, turn detection, STT-native turn events, backchannel filtering, re-engagement, tool timing policies, and latency telemetry.
- Internal secret boundary between CXB API and CXB Core.
- CRM API key model for external dialout.
- Knowledge Base RAG, tool telemetry, auto dispositions, and recording tokenization are product-grade primitives.
- Shipped cost/quality features: live prompt caching, version-based post-call caching, TTS caching, campaign callback scheduling, and Agent Desk human escalation.
- Meaningful test coverage across Python services and selected CXB Console logic. CXB Core alone collected 540 tests as of 2026-05-19; all must pass before deploying.
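The durable outbox pattern listed above can be sketched as follows. This is a minimal illustration using SQLite from the standard library, not the CXB implementation: the table layout and the `init_outbox`, `enqueue`, and `drain` names are assumptions made for the example. The idea is that a result is written to local durable storage first and is only marked delivered after the downstream call succeeds, so a crash or a failed push leaves the row in place for a later retry.

```python
import json
import sqlite3


def init_outbox(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) outbox table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "attempts INTEGER NOT NULL DEFAULT 0, "
        "delivered INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def enqueue(conn: sqlite3.Connection, payload: dict) -> int:
    """Persist a payload before any delivery attempt is made."""
    cur = conn.execute(
        "INSERT INTO outbox (payload) VALUES (?)", (json.dumps(payload),)
    )
    conn.commit()
    return cur.lastrowid


def drain(conn: sqlite3.Connection, send) -> int:
    """Try to deliver every pending row; return how many succeeded."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, payload in rows:
        try:
            send(json.loads(payload))
        except Exception:
            # Leave the row pending and record the failed attempt.
            conn.execute(
                "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?",
                (row_id,),
            )
            conn.commit()
            continue
        conn.execute("UPDATE outbox SET delivered = 1 WHERE id = ?", (row_id,))
        conn.commit()
        delivered += 1
    return delivered
```

Because a row can be sent and then fail to be marked delivered (for example, if the process dies between the two steps), this pattern gives at-least-once delivery, so the receiver would need to tolerate duplicates.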
Weaknesses
- Top-level READMEs in service repos are too thin.
- Some production knowledge still lives only in each service’s source-of-truth dev guide, old plans, or memory rather than in shared docs.
- Full CXB Console lint has known legacy issues.
- Observability is not yet a complete operations surface.
- No automated audio-quality regression suite yet; most subjective call quality still depends on human test calls and recordings.
- Fleet routing relies on a static URL list.
- CXB Dialler still runs as a single main-loop process.
- Current deployment process is manual and host-specific.
- Mutable Pydantic defaults exist in some models; Pydantic handles them, but default_factory would be cleaner.
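To illustrate the last point, here is a small sketch contrasting the two styles. The `CallSettings` model and its field names are hypothetical and exist only for this example. Pydantic copies a mutable default for each new instance, so the implicit form is safe, but `Field(default_factory=...)` states the intent explicitly and matches the convention that standard-library dataclasses require.

```python
from typing import Dict, List

from pydantic import BaseModel, Field


class CallSettings(BaseModel):  # hypothetical model, for illustration only
    # Implicit style: safe, because Pydantic copies the default per instance.
    tags: List[str] = []
    # Explicit style: a fresh container is built for every instance.
    retry_codes: List[int] = Field(default_factory=list)
    metadata: Dict[str, str] = Field(default_factory=dict)


a = CallSettings()
b = CallSettings()
a.tags.append("vip")
a.retry_codes.append(486)
# Mutating `a` does not leak into `b` in either style.
```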
Scale risks
| Risk | Impact | Direction |
|---|---|---|
| Static fleet URL polling | Becomes expensive and brittle with many hosts. | Central capacity registry. |
| One CXB Dialler loop | Limits campaign throughput and failover. | Partitioned workers with leases. |
| Provider rate limits | STT/TTS/LLM can bottleneck before compute. | Per-provider quota tracking and fallback plans. |
| Recording storage drift | Playback can fail if storage differs by host. | Shared object storage per deployment. |
| Weak alerting | Failures discovered late. | Metrics, alerts, and runbooks. |
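As one possible shape for the "central capacity registry" direction in the table above, the sketch below has hosts push heartbeats with their free channel count, and the router pick from hosts that reported recently. It is an in-memory illustration only: `CapacityRegistry`, `heartbeat`, `pick_host`, and the TTL value are all assumptions, and a real registry would sit in a shared store rather than in one process.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class HostRecord:
    free_channels: int
    last_seen: float


class CapacityRegistry:
    """Hypothetical registry: hosts report capacity, the router reads it."""

    def __init__(self, ttl_seconds: float = 15.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._hosts: Dict[str, HostRecord] = {}

    def heartbeat(self, host_url: str, free_channels: int) -> None:
        """Called by each host; replaces its previous report."""
        self._hosts[host_url] = HostRecord(free_channels, self._clock())

    def pick_host(self) -> Optional[str]:
        """Return the live host with the most free channels, or None."""
        now = self._clock()
        live = {
            url: rec for url, rec in self._hosts.items()
            if now - rec.last_seen <= self._ttl and rec.free_channels > 0
        }
        if not live:
            return None
        url = max(live, key=lambda u: live[u].free_channels)
        # Optimistic reservation; the next heartbeat overwrites this count.
        live[url].free_channels -= 1
        return url
```

Compared with polling a static URL list, the router does no network fan-out per call, and a host that stops reporting ages out of rotation once its last heartbeat is older than the TTL.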
What “good” looks like next
- A new engineer can trace a call from carrier ingress to CRM push using docs alone.
- Ops can create a bot and campaign without engineering help for normal cases.
- Docker Compose can run the stack locally.
- Production deploys use versioned images.
- Fleet workers register capacity centrally.
- Dialler can be restarted or scaled without losing campaign correctness.
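The last goal, restarting or scaling the dialler without losing campaign correctness, is commonly approached with time-limited leases: each campaign is owned by at most one worker at a time, and ownership lapses if the worker stops renewing. The sketch below is a single-process illustration under that assumption. `LeaseTable` and its methods are hypothetical names, and a production version would need the acquire step to be atomic in a shared store such as a database row with a conditional update.

```python
import time
from typing import Callable, Dict, Tuple


class LeaseTable:
    """Hypothetical lease table mapping campaign_id to (worker_id, expiry)."""

    def __init__(self, lease_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lease = lease_seconds
        self._clock = clock
        self._owners: Dict[str, Tuple[str, float]] = {}

    def acquire(self, campaign_id: str, worker_id: str) -> bool:
        """Take or renew the lease; fail if another worker holds a live one."""
        now = self._clock()
        owner = self._owners.get(campaign_id)
        if owner is not None and owner[0] != worker_id and owner[1] > now:
            return False
        self._owners[campaign_id] = (worker_id, now + self._lease)
        return True

    def release(self, campaign_id: str, worker_id: str) -> None:
        """Give up the lease early, but only if this worker still owns it."""
        owner = self._owners.get(campaign_id)
        if owner is not None and owner[0] == worker_id:
            del self._owners[campaign_id]
```

A worker would call `acquire` before each dialling pass and skip the campaign when it returns `False`. If a worker crashes, another one can take over the campaign after the lease expires.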