search_knowledge tool. CXB API owns ingestion (extract → chunk → embed → index) and the internal search endpoint; the vectors live in the vector database and the metadata in MongoDB.
The KB pipeline lives in the API service’s knowledge route, knowledge service, and the document, chunking, embedding, and vector-store modules.
Stores
| Store | Collection / object | Holds |
|---|---|---|
| MongoDB | knowledge_bases | KB metadata, counts, status (active/disabled), default_language, tenant_id |
| MongoDB | knowledge_documents | Per-document parse_status, chunk_count, text_char_count, storage_key |
| Vector database | collection cxb_knowledge_chunks (configurable) | Chunk vectors + payload (tenant_id, kb_id, document_id, chunk_id, chunk_text, active, …) |
IDs take the form kb_<hex> and doc_<hex>; chunk IDs are {document_id}_chunk_{i}. Vector-database point IDs are a deterministic UUID5 of the chunk ID, so re-ingestion is idempotent.
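As a minimal sketch of why a UUID5 point ID makes re-ingestion idempotent: the same chunk ID always hashes to the same UUID, so an upsert overwrites the existing point instead of adding a duplicate. The namespace constant and helper names below are assumptions for illustration; the source does not state which namespace the service uses.

```python
import uuid

# Hypothetical namespace: the actual namespace used by the service is not documented here.
CHUNK_NAMESPACE = uuid.NAMESPACE_URL


def chunk_id(document_id: str, index: int) -> str:
    """Build a chunk ID in the documented {document_id}_chunk_{i} form."""
    return f"{document_id}_chunk_{index}"


def point_id(chunk_id_value: str) -> str:
    """Derive a deterministic UUID5 from the chunk ID (same input, same UUID)."""
    return str(uuid.uuid5(CHUNK_NAMESPACE, chunk_id_value))
```

Calling point_id twice with the same chunk ID yields the same string, which is the property the upsert relies on.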
Admin endpoints
Under /api/v1/knowledge-bases, all requiring admin:
| Method | Path | Purpose |
|---|---|---|
| POST | / | Create a KB |
| GET | / | List KBs for the tenant |
| GET | /{kb_id} | Get one KB |
| PATCH | /{kb_id} | Update name/description/status/language |
| DELETE | /{kb_id} | Soft-disable (sets status=disabled) |
| POST | /{kb_id}/documents | Upload + ingest a document |
| GET | /{kb_id}/documents | List documents |
| DELETE | /{kb_id}/documents/{document_id} | Delete document + its vector-database chunks |
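The route table above can be read as a small path scheme. The helper below is a hypothetical client-side convenience, not part of the API; it only composes paths and assumes the collection route keeps the trailing slash shown in the table.

```python
from urllib.parse import quote

PREFIX = "/api/v1/knowledge-bases"


def kb_path(kb_id=None, *, documents=False, document_id=None):
    """Return the admin route path for the KB collection, one KB, or its documents."""
    if kb_id is None:
        # Collection routes (create / list) are listed as "/" under the prefix.
        return PREFIX + "/"
    path = f"{PREFIX}/{quote(kb_id, safe='')}"
    if document_id is not None:
        return f"{path}/documents/{quote(document_id, safe='')}"
    if documents:
        return f"{path}/documents"
    return path
```

For example, kb_path("kb_1a", document_id="doc_2b") gives the path used by the document DELETE route.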
Ingestion pipeline
ingest_document runs synchronously within the upload request and records progress on the document:
| Stage | Step | Notes |
|---|---|---|
| Extract | extract_text | Supported: .pdf, .txt, .md, .csv, .docx. PDF via pypdf, DOCX via python-docx (paragraphs + tables). Unsupported types raise KnowledgeDocumentError. |
| Chunk | chunk_text | Normalizes whitespace, then slides a window of chunk_size chars with overlap, preferring to break at a \n, ". ", or space boundary found past 50% of the window. Defaults chunk_size_chars=1200, chunk_overlap_chars=180. |
| Embed | embed_texts | The LLM client SDK, model from knowledge.embedding_model (settings default <embedding-model>; the route falls back to a default embedding model if the field is unset), output_dimensionality = embedding_dimensions (768). Requires the LLM provider API key. |
| Index | upsert_chunks | Ensures the collection (cosine distance) + payload indexes on tenant_id/kb_id/document_id/active. |
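The Chunk stage can be sketched roughly as follows. This is an illustrative reimplementation under stated assumptions, not the service's code: the exact whitespace normalization and the tie-breaking order between boundary types are guesses, and only the window size, overlap, and 50% rule come from the table above.

```python
import re


def chunk_text(text: str, chunk_size: int = 1200, overlap: int = 180) -> list[str]:
    """Split text into overlapping windows, preferring natural break points."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    # Assumed normalization: collapse runs of spaces/tabs and of blank lines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text).strip()

    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Only accept a boundary in the second half of the window.
            half = start + chunk_size // 2
            for sep in ("\n", ". ", " "):
                cut = text.rfind(sep, half, end)
                if cut != -1:
                    end = cut + len(sep)
                    break
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        if end >= len(text):
            break
        # Step back by the overlap, but always move forward at least one char.
        start = max(end - overlap, start + 1)
    return chunks
```

Restricting the boundary search to the second half of the window keeps chunks from becoming very short when a separator happens to sit near the window start.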
Uploads are capped at max_upload_mb (default 20); a larger file returns 413. On any ingestion error the document is marked failed with a truncated parse_error (the upload still returns 200 with that status).
Upload is not deferred to a worker — extraction, embedding, and vector-database upsert all happen inside the request. Large documents therefore make the upload call slow rather than returning a queued status.
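The upload behaviour described above (size cap, synchronous ingestion, failures recorded on the document rather than surfaced as an HTTP error) can be sketched like this. The function name, the "processing" and "ready" status values, the 500-character truncation limit, and the 1024 × 1024 bytes-per-MB conversion are all assumptions; only the 413, the failed status, and the 200-on-failure behaviour come from the text.

```python
from typing import Callable, Optional

MAX_PARSE_ERROR_CHARS = 500  # hypothetical truncation limit


def handle_upload(
    data: bytes,
    ingest: Callable[[bytes], int],
    max_upload_mb: int = 20,
) -> tuple[int, Optional[dict]]:
    """Return (http_status, document_record) for a synchronous upload."""
    if len(data) > max_upload_mb * 1024 * 1024:
        return 413, None

    # "processing" and "ready" are placeholder status names for this sketch.
    document = {"parse_status": "processing", "chunk_count": 0, "parse_error": None}
    try:
        # Extraction, embedding, and upsert all run inline, inside the request.
        document["chunk_count"] = ingest(data)
        document["parse_status"] = "ready"
    except Exception as exc:
        document["parse_status"] = "failed"
        document["parse_error"] = str(exc)[:MAX_PARSE_ERROR_CHARS]
    return 200, document
```

A caller therefore has to inspect parse_status in the response body; the HTTP status alone does not say whether ingestion succeeded.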
Bot attachment
A bot’s KB attachment lives in bot.knowledge (BotKnowledgeConfig in models/knowledge.py):
| Field | Default | Purpose |
|---|---|---|
| enabled | false | Master toggle |
| kb_ids | [] | Attached KBs (deduped) |
| top_k | 4 (1–10) | Max chunks returned |
| score_threshold | 0.55 (0–1) | Min cosine score |
| strict | true | If true, emit fallback_message when no hit |
| trigger_instructions | "" | Natural-language guidance injected into the search_knowledge tool description in CXB Core |
| fallback_message | default sentence | Spoken when nothing is found in strict mode |
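To make the defaults and ranges in the table concrete, here is a stand-in written as a plain dataclass. It is not the real BotKnowledgeConfig from models/knowledge.py (whose base class and validators are not shown in this page); the class name is hypothetical and the fallback_message default is a placeholder because the actual sentence is not given.

```python
from dataclasses import dataclass, field


@dataclass
class BotKnowledgeConfigSketch:
    """Illustrative mirror of the documented fields, defaults, and ranges."""

    enabled: bool = False
    kb_ids: list[str] = field(default_factory=list)
    top_k: int = 4
    score_threshold: float = 0.55
    strict: bool = True
    trigger_instructions: str = ""
    fallback_message: str = "<default sentence>"  # placeholder, real text not documented

    def __post_init__(self) -> None:
        # Dedupe attached KBs while preserving their original order.
        self.kb_ids = list(dict.fromkeys(self.kb_ids))
        if not 1 <= self.top_k <= 10:
            raise ValueError("top_k must be between 1 and 10")
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")
```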
Search contract (CXB Core)
CXB Core calls POST /api/v1/internal/knowledge/search, authenticated by X-CXB-Core-Secret. The request carries bot_id, session_id, query, and optional kb_ids/top_k/score_threshold/strict.
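A request of this shape could be assembled as in the sketch below. It only builds the request object and does not send it. The helper name, the base URL, and the choice to omit unset optional fields are assumptions; the path, header name, and field names come from the paragraph above.

```python
import json
from urllib.request import Request

OPTIONAL_FIELDS = {"kb_ids", "top_k", "score_threshold", "strict"}


def build_search_request(base_url, secret, bot_id, session_id, query, **overrides):
    """Build (but do not send) the internal knowledge-search POST request."""
    unknown = set(overrides) - OPTIONAL_FIELDS
    if unknown:
        raise TypeError(f"unsupported fields: {sorted(unknown)}")

    body = {"bot_id": bot_id, "session_id": session_id, "query": query}
    # Assumption: optional overrides are left out of the body when not set.
    body.update({k: v for k, v in overrides.items() if v is not None})

    return Request(
        base_url.rstrip("/") + "/api/v1/internal/knowledge/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-CXB-Core-Secret": secret,
        },
        method="POST",
    )
```

Note that urllib stores header names in capitalized form internally (X-cxb-core-secret); HTTP header names are case-insensitive, so the server sees the same header.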
search_knowledge enforces and accelerates access:
- Access guard: validate_bot_kb_access_with_meta intersects the requested kb_ids with the bot’s attached, active KBs. Unattached or disabled KBs are silently dropped; no active KB → empty hits.
- Redis caching (layered): KB-access (60s), query embeddings (24h), and full results (30m on hit, 2m on no-hit). Result cache keys include a kb_revision derived from each KB’s updated_at/counts/status, so editing a KB invalidates cached answers.
- Vector-database filter: tenant_id + kb_id ∈ active + active=true, top-k with score_threshold.
- Response includes hits[] (with score, source_name, chunk_text) and a metrics block (embedding/vector-search/cache timings and cache-hit flags). In strict mode with no hits, fallback_message is returned.
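The access guard and the revision-keyed result cache from the list above can be sketched as follows. This illustrates the idea only: the function names, the key prefix, the hashing scheme, and the document_count/chunk_count field names standing in for "counts" are assumptions, not the service's actual implementation.

```python
import hashlib
import json


def allowed_kb_ids(requested, attached, status_by_kb):
    """Keep only requested KBs that are attached to the bot and active."""
    attached_set = set(attached)
    return [
        kb_id
        for kb_id in dict.fromkeys(requested)  # dedupe, keep order
        if kb_id in attached_set and status_by_kb.get(kb_id) == "active"
    ]


def kb_revision(kb):
    """Fingerprint the KB fields whose change should invalidate cached results."""
    material = json.dumps(
        [kb["updated_at"], kb["document_count"], kb["chunk_count"], kb["status"]]
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


def result_cache_key(tenant_id, query, kbs, top_k, score_threshold):
    """Build a result-cache key that changes whenever any searched KB changes."""
    payload = json.dumps(
        {
            "tenant_id": tenant_id,
            "query": query,
            "top_k": top_k,
            "score_threshold": score_threshold,
            "revisions": sorted([kb["kb_id"], kb_revision(kb)] for kb in kbs),
        },
        sort_keys=True,
    )
    return "kb:search:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def result_ttl_seconds(hits):
    """30 minutes when something was found, 2 minutes for an empty result."""
    return 30 * 60 if hits else 2 * 60
```

Folding the revision into the key means stale entries are never read after an edit; they simply expire on their own TTL, so no explicit cache purge is needed.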
Related docs
- CXB Core pipeline: How search_knowledge is registered and invoked mid-call.
- Settings: The knowledge system-settings block (vector-database URL, embedding model, chunk sizes).
- Tools & integrations: Bot-level tool configuration including knowledge.
- CXB API overview: Where the KB pipeline fits in the control plane.