CXB API-owned lifecycle for the live-conversation managed-LLM prompt cache: prewarm, cleanup, TTL, audit, and mid-call recreate.
For direct-prompt bots on the managed LLM platform, the static system prompt can be served from an explicit provider-side cached-content object so the large static portion is not re-billed on every call. CXB API owns the cache lifecycle (create, refresh, delete, audit); CXB Core consumes the cache name and swaps it if it expires mid-call.

This page covers the CXB API side. For how CXB Core injects the cache into the LLM request, see Caching.
start_scheduler() in the live-prompt-cache lifecycle service registers two APScheduler cron jobs (timezone Asia/Kolkata by default):
| Job | Default hour | Setting | Action |
| --- | --- | --- | --- |
| Prewarm | 07:00 | live_prompt_cache_prewarm_hour | run_prewarm: create fresh cached content for every enabled bot. |
| Cleanup | 23:00 | live_prompt_cache_cleanup_hour | run_cleanup: delete caches for disabled bots and clear their state. |
The cache TTL is live_prompt_cache_ttl_hours (default 25h).
The TTL deliberately exceeds the 24h prewarm interval. A cache stays valid until the next prewarm replaces it, so an enabled bot is continuously covered. A shorter TTL would leave a dead window each day where calls run uncached: with the default 07:00 prewarm, a 10h TTL expires at 17:00 and leaves 14 hours uncovered until the next prewarm.
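The coverage argument can be checked with a few lines of arithmetic. This is a minimal sketch, not the service's code: `uncovered_window` is a hypothetical helper, and it assumes a cache is created exactly at each prewarm and lives for the full TTL.

```python
from datetime import timedelta


def uncovered_window(
    ttl: timedelta,
    prewarm_interval: timedelta = timedelta(hours=24),
) -> timedelta:
    """Time per prewarm cycle during which no valid cache exists.

    Hypothetical helper: assumes the cache is created at the prewarm
    and is valid for exactly `ttl` afterwards.
    """
    gap = prewarm_interval - ttl
    return max(gap, timedelta(0))


# Default 25h TTL: the cache outlives the 24h cycle, so there is no gap.
default_gap = uncovered_window(timedelta(hours=25))

# 10h TTL: the cache is gone for the remaining 14h of each cycle.
short_gap = uncovered_window(timedelta(hours=10))
```

With the default settings the extra hour of TTL is what absorbs a prewarm that starts or finishes slightly late.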
Cleanup only targets bots where live_prompt_cache.enabled != true. Superseded caches of enabled bots are not deleted nightly; they age out by TTL on the managed LLM platform. Deleting an enabled bot's cache nightly would remove a cache that a running campaign still needs.
On app startup, _catch_up_missed_runs compares the last recorded run (system_runtime doc live_prompt_cache) against the most recent expected cron fire. If a prewarm or cleanup was missed (e.g. the process was down at 07:00), it runs immediately and records a catchup_triggered audit event. Run timestamps are written in a finally block so a crashing job body still advances the timestamp and avoids re-firing on the next boot.
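The catch-up check can be sketched as follows. This is an illustration under assumptions rather than the actual `_catch_up_missed_runs`: the helper names are hypothetical, a missing last-run record is treated as a missed run, and Asia/Kolkata is modelled as a fixed UTC+05:30 offset (it has no DST).

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

# Asia/Kolkata is a fixed UTC+05:30 offset with no DST, so a fixed-offset
# timezone is sufficient for this sketch.
IST = timezone(timedelta(hours=5, minutes=30))


def most_recent_expected_fire(now: datetime, hour: int) -> datetime:
    """Latest daily HH:00 fire time (in IST) that is not after `now`."""
    local_now = now.astimezone(IST)
    fire = local_now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if fire > local_now:
        fire -= timedelta(days=1)
    return fire


def missed_run(last_run: Optional[datetime], now: datetime, hour: int) -> bool:
    """True when no run is recorded since the latest expected fire.

    Assumption: a missing record (None) counts as a missed run.
    """
    if last_run is None:
        return True
    return last_run < most_recent_expected_fire(now, hour)


def run_and_record(
    job: Callable[[], None],
    record_timestamp: Callable[[datetime], None],
) -> None:
    """Run `job` and record the run time even if the job raises."""
    try:
        job()
    finally:
        # Written in `finally` so a crashing job still advances the timestamp.
        record_timestamp(datetime.now(IST))
```

Recording the timestamp in `finally` trades retry-on-boot for protection against a job that crashes on every start.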
run_prewarm_for_bot takes a per-bot Redis lock (cxbapi:live_prompt_prewarm:{bot_id}, nx, 300s TTL) so only one CXB API instance prewarms a given bot per window. The scheduled path leaves the lock to expire; the /recreate path releases it on completion (release_lock_on_completion=True) so a follow-up recreate (e.g. after a version_override bump) is not blocked.
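The locking behaviour described above might look roughly like this. It is a sketch under assumptions: `prewarm_with_lock` and `InMemoryRedis` are hypothetical names, the stand-in mimics only the `set(..., nx=True, ex=...)` and `delete` calls of a redis-py client and ignores expiry, and returning `None` when the lock is already held is an assumption about how a skipped prewarm is reported.

```python
from typing import Callable, Dict, Optional

LOCK_TTL_SECONDS = 300


class InMemoryRedis:
    """Tiny stand-in exposing the two redis-py style calls the sketch needs."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, name: str, value: str, nx: bool = False,
            ex: Optional[int] = None) -> Optional[bool]:
        if nx and name in self._data:
            return None  # redis-py returns None when NX blocks the write
        self._data[name] = value  # expiry is ignored in this stand-in
        return True

    def delete(self, name: str) -> int:
        return 1 if self._data.pop(name, None) is not None else 0


def prewarm_with_lock(
    redis_client,
    bot_id: str,
    prewarm: Callable[[str], str],
    release_lock_on_completion: bool = False,
) -> Optional[str]:
    """Run `prewarm(bot_id)` only if this caller wins the per-bot lock."""
    key = f"cxbapi:live_prompt_prewarm:{bot_id}"
    if not redis_client.set(key, "1", nx=True, ex=LOCK_TTL_SECONDS):
        return None  # another instance already holds the lock for this bot
    try:
        return prewarm(bot_id)
    finally:
        # Scheduled path: leave the lock to expire on its own.
        # /recreate path: release so a follow-up recreate is not blocked.
        if release_lock_on_completion:
            redis_client.delete(key)
```

Leaving the lock in place on the scheduled path means a second instance firing the same cron job within 300 seconds skips the bot instead of creating a duplicate cache.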
The live-prompt-cache provider service wraps the managed LLM platform client:
| Function | Purpose |
| --- | --- |
| create_cache | Calls caches.create with system_instruction, ttl, and optional tools/display_name. Returns the cache name. |
| delete_cache | Calls caches.delete; returns False on 404 (already gone), True on delete. |
| extend_ttl | Calls caches.update to push the TTL out (recovery path). |
Clients are cached per (project_id, location) in _managed_llm_clients and evicted on auth errors (401/403) so stale credentials get rebuilt. Credentials come from the managed-LLM-platform entry in system settings api_keys (service-account JSON); location defaults to us-east4. A managed-LLM-platform bot with no project_id is skipped with a prewarm_failed event (reason: missing_project_id).
Both internal live-cache endpoints require X-CXBCore-Secret.
| Endpoint | Body | Behaviour |
| --- | --- | --- |
| POST /api/v1/internal/live-prompt-cache/recreate | { bot_id } | Inline single-flight prewarm for one bot. Returns { cache_name }, or 503 recreate_failed / 404 bot not found. Used by CXB Core when a cache expires mid-call. |
| POST /api/v1/internal/live-prompt-cache/event | { bot_id, event_type, cache_name?, details? } | Append-only audit sink so CXB Core can record events like expired_in_call, swap_after_expiry. An unknown event_type returns 400. |
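A caller's view of the two endpoints can be sketched with the standard library. This only builds the requests and does not send them; the function names, the base URL, and the choice to omit optional fields when they are not provided are assumptions for illustration.

```python
import json
from typing import Any, Dict, Optional
from urllib.request import Request


def _post(base_url: str, path: str, secret: str, body: Dict[str, Any]) -> Request:
    """Build a JSON POST carrying the shared-secret header."""
    return Request(
        url=f"{base_url}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-CXBCore-Secret": secret,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_recreate_request(base_url: str, secret: str, bot_id: str) -> Request:
    """Request an inline prewarm for one bot."""
    return _post(base_url, "/api/v1/internal/live-prompt-cache/recreate",
                 secret, {"bot_id": bot_id})


def build_event_request(base_url: str, secret: str, bot_id: str,
                        event_type: str,
                        cache_name: Optional[str] = None,
                        details: Optional[Dict[str, Any]] = None) -> Request:
    """Record an audit event; optional fields are omitted when not given."""
    body: Dict[str, Any] = {"bot_id": bot_id, "event_type": event_type}
    if cache_name is not None:
        body["cache_name"] = cache_name
    if details is not None:
        body["details"] = details
    return _post(base_url, "/api/v1/internal/live-prompt-cache/event",
                 secret, body)
```

Passing either request to `urllib.request.urlopen` would send it; a 503 or 404 from /recreate surfaces there as an `HTTPError`.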
The live-prompt-cache audit service writes append-only docs to live_prompt_cache_events. record_cache_event validates event_type against an allow-list and never propagates insert failures (audit must not break a call).

Allowed event types: created, expired_in_call, swap_after_expiry, invalidated_at_shutdown, extended, recreated_after_expiry, prewarm_succeeded, prewarm_failed, cleanup_succeeded, cleanup_failed, catchup_triggered.

The collection has indexes on (bot_id, ts) and (event_type, ts), plus a 90-day TTL index on ts.
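The allow-list check and swallowed insert failures might be sketched as below. This is not the service's implementation: the signature, the `ValueError` for an unknown type, the boolean return value, and the document field names (taken from the /event body plus `ts`) are assumptions, and `collection` is any object with an `insert_one(doc)` method.

```python
import logging
from datetime import datetime, timezone
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)

ALLOWED_EVENT_TYPES = frozenset({
    "created", "expired_in_call", "swap_after_expiry",
    "invalidated_at_shutdown", "extended", "recreated_after_expiry",
    "prewarm_succeeded", "prewarm_failed", "cleanup_succeeded",
    "cleanup_failed", "catchup_triggered",
})


def record_cache_event(
    collection: Any,
    bot_id: str,
    event_type: str,
    cache_name: Optional[str] = None,
    details: Optional[Dict[str, Any]] = None,
) -> bool:
    """Append one audit doc; returns False instead of raising on insert failure."""
    if event_type not in ALLOWED_EVENT_TYPES:
        # Assumption: an unknown type is rejected up front, which an
        # endpoint could map to a 400 response.
        raise ValueError(f"unknown event_type: {event_type}")
    doc = {
        "bot_id": bot_id,
        "event_type": event_type,
        "cache_name": cache_name,
        "details": details or {},
        "ts": datetime.now(timezone.utc),
    }
    try:
        collection.insert_one(doc)
        return True
    except Exception:
        # Audit must not break a call: log and carry on.
        logger.exception("failed to record cache event %s", event_type)
        return False
```

Validation errors are raised while storage errors are swallowed, so a caller bug is visible but a database outage never interrupts a live call.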