A knowledge base (KB) is a tenant-scoped collection of documents that CXB Core can query at call time via the search_knowledge tool. CXB API owns ingestion (extract → chunk → embed → index) and the internal search endpoint; the vectors live in the vector database and the metadata in MongoDB. The KB pipeline lives in the API service’s knowledge route, knowledge service, and the document, chunking, embedding, and vector-store modules.

Stores

| Store | Collection / object | Holds |
| --- | --- | --- |
| MongoDB | knowledge_bases | KB metadata, counts, status (active/disabled), default_language, tenant_id |
| MongoDB | knowledge_documents | Per-document parse_status, chunk_count, text_char_count, storage_key |
| Vector database | collection cxb_knowledge_chunks (configurable) | Chunk vectors + payload (tenant_id, kb_id, document_id, chunk_id, chunk_text, active, …) |

KB and document IDs are prefixed (kb_<hex>, doc_<hex>); chunk IDs take the form {document_id}_chunk_{i}. Each vector-database point ID is a deterministic UUID5 of the chunk ID, so re-ingesting the same document overwrites its existing points instead of creating duplicates.
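
The ID scheme above can be sketched with Python's standard uuid module. This is a minimal illustration, not the service's code: the helper names and the choice of uuid.NAMESPACE_URL as the UUID5 namespace are assumptions, since the namespace actually used is not stated here.

```python
import uuid

# Assumption: the real namespace UUID is not documented; NAMESPACE_URL is a stand-in.
POINT_ID_NAMESPACE = uuid.NAMESPACE_URL


def make_chunk_id(document_id: str, index: int) -> str:
    """Build a chunk ID in the documented {document_id}_chunk_{i} form."""
    return f"{document_id}_chunk_{index}"


def make_point_id(chunk_id: str) -> str:
    """Derive a deterministic UUID5 point ID from a chunk ID."""
    return str(uuid.uuid5(POINT_ID_NAMESPACE, chunk_id))


# The same chunk ID always maps to the same point ID, so an upsert
# for a re-ingested document replaces points rather than adding new ones.
first = make_point_id(make_chunk_id("doc_ab12", 0))
second = make_point_id(make_chunk_id("doc_ab12", 0))
```

Because the point ID depends only on the chunk ID, idempotency holds as long as chunk indexes are assigned the same way on each ingestion run.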

Admin endpoints

Under /api/v1/knowledge-bases, all requiring admin:

| Method | Path | Purpose |
| --- | --- | --- |
| POST | / | Create a KB |
| GET | / | List KBs for the tenant |
| GET | /{kb_id} | Get one KB |
| PATCH | /{kb_id} | Update name/description/status/language |
| DELETE | /{kb_id} | Soft-disable (sets status=disabled) |
| POST | /{kb_id}/documents | Upload + ingest a document |
| GET | /{kb_id}/documents | List documents |
| DELETE | /{kb_id}/documents/{document_id} | Delete document + its vector-database chunks |

Ingestion pipeline

ingest_document runs synchronously within the upload request and records progress on the document:

| Stage | Step | Notes |
| --- | --- | --- |
| Extract | extract_text | Supported: .pdf, .txt, .md, .csv, .docx. PDF via pypdf, DOCX via python-docx (paragraphs + tables). Unsupported types raise KnowledgeDocumentError. |
| Chunk | chunk_text | Normalizes whitespace, then slides a window of chunk_size chars with overlap, preferring to break at a newline ("\n"), sentence end (". "), or space that falls past 50% of the window. Defaults chunk_size_chars=1200, chunk_overlap_chars=180. |
| Embed | embed_texts | Uses the LLM client SDK. The model comes from knowledge.embedding_model (settings default <embedding-model>; the route falls back to a default embedding model if the field is unset), with output_dimensionality = embedding_dimensions (768). Requires the LLM provider API key. |
| Index | upsert_chunks | Ensures the collection (cosine distance) + payload indexes on tenant_id/kb_id/document_id/active. |

The upload route enforces max_upload_mb (default 20) and returns 413 if the limit is exceeded. On any ingestion error the document is marked failed with a truncated parse_error; the upload still returns 200, with the failed status in the response.

Upload is not deferred to a worker: extraction, embedding, and the vector-database upsert all happen inside the request. Large documents therefore make the upload call slow rather than returning a queued status.
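
The chunking step described above can be sketched as follows. This is an illustration under stated assumptions, not the actual chunk_text implementation: the exact whitespace normalization, the order in which boundaries are tried, and the parameter names are guesses based on the description.

```python
import re


def chunk_text(text: str, chunk_size: int = 1200, overlap: int = 180) -> list[str]:
    """Split text into overlapping character windows, preferring natural breaks."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    # Assumed normalization: collapse runs of spaces/tabs and cap blank lines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text).strip()

    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            window = text[start:end]
            # Prefer the last newline, then ". ", then a space,
            # but only if it falls past the midpoint of the window.
            for sep in ("\n", ". ", " "):
                cut = window.rfind(sep)
                if cut > chunk_size // 2:
                    end = start + cut + len(sep)
                    break
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        if end >= len(text):
            break
        # Step back by the overlap, but always make forward progress.
        start = max(end - overlap, start + 1)
    return chunks


sample_chunks = chunk_text("word " * 1000, chunk_size=100, overlap=20)
```

Requiring the boundary to sit past the midpoint keeps chunks from becoming very short when a separator appears early in the window.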

Bot attachment

A bot’s KB attachment lives in bot.knowledge (BotKnowledgeConfig in models/knowledge.py):

| Field | Default | Purpose |
| --- | --- | --- |
| enabled | false | Master toggle |
| kb_ids | [] | Attached KBs (deduped) |
| top_k | 4 (1–10) | Max chunks returned |
| score_threshold | 0.55 (0–1) | Min cosine score |
| strict | true | If true, emit fallback_message when no hit |
| trigger_instructions | "" | Natural-language guidance injected into the search_knowledge tool description in CXB Core |
| fallback_message | default sentence | Spoken when nothing is found in strict mode |
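
The fields and bounds above can be expressed as a small dataclass. This is a sketch only: the real BotKnowledgeConfig in models/knowledge.py may be built with a different library and may clamp rather than reject out-of-range values. The class name and the fallback sentence below are placeholders, because the actual default sentence is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class BotKnowledgeConfigSketch:
    """Illustrative stand-in for the bot.knowledge settings."""

    enabled: bool = False
    kb_ids: list[str] = field(default_factory=list)
    top_k: int = 4
    score_threshold: float = 0.55
    strict: bool = True
    trigger_instructions: str = ""
    # Placeholder text: the real default sentence is not documented here.
    fallback_message: str = "Sorry, I could not find that information."

    def __post_init__(self) -> None:
        # Dedupe attached KB IDs while preserving their order.
        self.kb_ids = list(dict.fromkeys(self.kb_ids))
        # Assumption: out-of-range values are rejected rather than clamped.
        if not 1 <= self.top_k <= 10:
            raise ValueError("top_k must be between 1 and 10")
        if not 0.0 <= self.score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")


config = BotKnowledgeConfigSketch(enabled=True, kb_ids=["kb_a", "kb_b", "kb_a"])
```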

Search contract (CXB Core)

CXB Core calls POST /api/v1/internal/knowledge/search, authenticated by X-CXB-Core-Secret. The request carries bot_id, session_id, query, and optional kb_ids/top_k/score_threshold/strict. search_knowledge enforces and accelerates access:
  • Access guard: validate_bot_kb_access_with_meta intersects the requested kb_ids with the bot’s attached, active KBs. Unattached or disabled KBs are silently dropped; no active KB → empty hits.
  • Redis caching (layered): KB-access (60s), query embeddings (24h), and full results (30m on hit, 2m on no-hit). Result cache keys include a kb_revision derived from each KB’s updated_at/counts/status, so editing a KB invalidates cached answers.
  • Vector-database filter: tenant_id matches, kb_id is one of the accessible active KBs, and active=true; the query returns the top-k results that meet score_threshold.
  • Response includes hits[] (with score, source_name, chunk_text) and a metrics block (embedding/vector-search/cache timings and cache-hit flags). In strict mode with no hits, fallback_message is returned.
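
The access guard and the revision-aware result cache described in the list above can be sketched as follows. All function names, the cache key layout, the "counts" and "id" field names, and the choice of SHA-256 are hypothetical. Treating an empty kb_ids request as "all attached KBs" is also an assumption. Only the intersection rule, the revision inputs, and the TTLs come from the description.

```python
import hashlib
import json


def allowed_kb_ids(requested, attached, status_by_kb):
    """Keep only requested KBs that are attached to the bot and active."""
    # Assumption: with no kb_ids in the request, all attached KBs are candidates.
    candidates = requested or attached
    attached_set = set(attached)
    return [
        kb_id
        for kb_id in dict.fromkeys(candidates)  # dedupe, keep order
        if kb_id in attached_set and status_by_kb.get(kb_id) == "active"
    ]


def kb_revision(kb):
    """Fingerprint a KB from its updated_at, counts, and status."""
    basis = json.dumps([kb["updated_at"], kb["counts"], kb["status"]], sort_keys=True)
    return hashlib.sha256(basis.encode("utf-8")).hexdigest()[:12]


def result_cache_key(tenant_id, query, kbs, top_k, score_threshold):
    """Build a result cache key that changes whenever any KB revision changes."""
    payload = {
        "tenant_id": tenant_id,
        "query": query,
        "kbs": sorted((kb["id"], kb_revision(kb)) for kb in kbs),
        "top_k": top_k,
        "score_threshold": score_threshold,
    }
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
    return f"kb:result:{digest}"


def result_ttl_seconds(hits):
    """30 minutes when there are hits, 2 minutes when there are none."""
    return 30 * 60 if hits else 2 * 60


kb = {
    "id": "kb_a",
    "updated_at": "2024-01-01T00:00:00Z",
    "counts": {"documents": 2, "chunks": 40},
    "status": "active",
}
key_before = result_cache_key("tenant_1", "refund policy", [kb], 4, 0.55)
key_after = result_cache_key(
    "tenant_1", "refund policy", [dict(kb, updated_at="2024-02-01T00:00:00Z")], 4, 0.55
)
```

Because the revision is folded into the key, an edited KB stops matching old entries and those entries simply expire, so no explicit invalidation pass is needed.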

CXB Core pipeline

How search_knowledge is registered and invoked mid-call.

Settings

The knowledge system-settings block (vector-database URL, embedding model, chunk sizes).

Tools & integrations

Bot-level tool configuration including knowledge.

CXB API overview

Where the KB pipeline fits in the control plane.