For direct-prompt bots on the managed LLM platform, the static system prompt can be served from an explicit provider-side cached-content object so the large static portion is not re-billed on every call. CXB API owns the cache lifecycle (create, refresh, delete, audit); CXB Core consumes the cache name and swaps it if it expires mid-call. This page covers the CXB API side. For how CXB Core injects the cache into the LLM request, see Caching.

Ownership split

| Concern | Owner |
| --- | --- |
| Create / refresh / delete cached content on the managed LLM platform | CXB API (scheduler + /recreate) |
| Persisting live_prompt_cache_state on the bot | CXB API |
| Audit events | CXB API (live_prompt_cache_events) |
| Injecting cached_content into the live LLM call | CXB Core |
| Recovering from a mid-call expiry | CXB Core calls back into /recreate, then swaps |

Scheduled lifecycle

start_scheduler() in the live-prompt-cache lifecycle service registers two APScheduler cron jobs (timezone Asia/Kolkata by default):
| Job | Default hour | Setting | Action |
| --- | --- | --- | --- |
| Prewarm | 07:00 | live_prompt_cache_prewarm_hour | run_prewarm: create fresh cached content for every enabled bot. |
| Cleanup | 23:00 | live_prompt_cache_cleanup_hour | run_cleanup: delete caches for disabled bots and clear their state. |
The cache TTL is live_prompt_cache_ttl_hours (default 25h).
The TTL deliberately exceeds the 24h prewarm interval. A cache stays valid until the next prewarm replaces it, so an enabled bot is continuously covered. A shorter TTL (e.g. 10h) would leave a multi-hour dead window each day where calls run uncached: with the default 07:00 prewarm, a 10h cache expires at 17:00 and the next scheduled prewarm is 14 hours away.
Cleanup targets only bots where live_prompt_cache.enabled != true. Superseded caches of enabled bots are not deleted nightly; they age out by TTL on the managed LLM platform. Deleting an enabled bot’s cache nightly would null out a cache that a running campaign still needs.
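The TTL reasoning above can be checked with a little date arithmetic. This is a minimal sketch, not the service's code: the helper names uncovered_gap and expiry_for are hypothetical, and it considers only the scheduled daily prewarm (it ignores any cache rebuilt through /recreate).

```python
from datetime import datetime, timedelta

PREWARM_INTERVAL = timedelta(hours=24)  # one scheduled prewarm per day


def expiry_for(created_at: datetime, ttl_hours: float) -> datetime:
    """Provider-side expiry for a cache created at `created_at`."""
    return created_at + timedelta(hours=ttl_hours)


def uncovered_gap(ttl_hours: float) -> timedelta:
    """Time between a cache expiring and the next scheduled prewarm."""
    gap = PREWARM_INTERVAL - timedelta(hours=ttl_hours)
    return max(gap, timedelta(0))


created = datetime(2024, 1, 1, 7, 0)  # illustrative 07:00 prewarm
print(expiry_for(created, 25))  # 2024-01-02 08:00:00
print(uncovered_gap(25))        # 0:00:00
print(uncovered_gap(10))        # 14:00:00
```

With the 25h default, the old cache outlives the next 07:00 prewarm by one hour, which gives the new cache time to be created and written to the bot before the old one disappears.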

Startup catch-up

On app startup, _catch_up_missed_runs compares the last recorded run (system_runtime doc live_prompt_cache) against the most recent expected cron fire. If a prewarm or cleanup was missed (e.g. the process was down at 07:00), it runs immediately and records a catchup_triggered audit event. Run timestamps are written in a finally block so a crashing job body still advances the timestamp and avoids re-firing on the next boot.
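The catch-up decision described above could look roughly like the following. It is a sketch under assumptions: most_recent_fire, catch_up, and the record_run callback are hypothetical names, the audit event is omitted, and a fixed +05:30 offset stands in for Asia/Kolkata (which has no daylight saving time).

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

IST = timezone(timedelta(hours=5, minutes=30))  # fixed-offset stand-in for Asia/Kolkata


def most_recent_fire(now: datetime, hour: int) -> datetime:
    """Latest daily cron fire at `hour`:00 that is not in the future."""
    fire = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    return fire if fire <= now else fire - timedelta(days=1)


def catch_up(
    now: datetime,
    hour: int,
    last_run: Optional[datetime],
    job: Callable[[], None],
    record_run: Callable[[datetime], None],
) -> bool:
    """Run `job` if the last recorded run predates the most recent expected fire."""
    if last_run is not None and last_run >= most_recent_fire(now, hour):
        return False
    try:
        job()
    finally:
        record_run(now)  # advance the timestamp even if the job body raised
    return True


runs = []
now = datetime(2024, 1, 2, 9, 30, tzinfo=IST)
yesterday = datetime(2024, 1, 1, 7, 0, tzinfo=IST)
print(catch_up(now, 7, yesterday, lambda: None, runs.append))  # True
print(catch_up(now, 7, runs[-1], lambda: None, runs.append))   # False
```

Recording the timestamp in the finally block trades a possibly incomplete run for the guarantee that a consistently failing job does not re-fire on every boot.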

Cross-instance dedup

run_prewarm_for_bot takes a per-bot Redis lock (cxbapi:live_prompt_prewarm:{bot_id}, nx, 300s TTL) so only one CXB API instance prewarms a given bot per window. The scheduled path leaves the lock to expire; the /recreate path releases it on completion (release_lock_on_completion=True) so a follow-up recreate (e.g. after a version_override bump) is not blocked.
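The locking behaviour above can be illustrated as follows. This is a sketch, not the actual implementation: FakeRedis is an in-memory stand-in that mimics only the set(..., nx=True, ex=...) and delete calls of a Redis client, and the prewarm callback and return shapes are assumptions.

```python
import time


class FakeRedis:
    """In-memory stand-in supporting only the two calls used below."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._data.get(key)
        if nx and current is not None and current[1] > now:
            return None  # NX blocked the write: the key exists and has not expired
        expires = (now + ex) if ex is not None else float("inf")
        self._data[key] = (value, expires)
        return True

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def run_prewarm_for_bot(redis, bot_id, prewarm, release_lock_on_completion=False):
    key = f"cxbapi:live_prompt_prewarm:{bot_id}"
    if not redis.set(key, "1", nx=True, ex=300):
        return {"skipped": "lock_held"}  # another instance owns this window
    try:
        return {"cache_name": prewarm(bot_id)}
    finally:
        if release_lock_on_completion:
            redis.delete(key)  # /recreate path: do not block a follow-up recreate


r = FakeRedis()
make = lambda bot_id: f"cache-for-{bot_id}"  # hypothetical prewarm body

print(run_prewarm_for_bot(r, "bot1", make))  # {'cache_name': 'cache-for-bot1'}
print(run_prewarm_for_bot(r, "bot1", make))  # {'skipped': 'lock_held'}
print(run_prewarm_for_bot(r, "bot2", make, release_lock_on_completion=True))
print(run_prewarm_for_bot(r, "bot2", make, release_lock_on_completion=True))  # runs again
```

The scheduled path keeps the lock until its 300s TTL lapses, so a second instance firing the same cron job moments later is skipped rather than duplicating the provider call.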

Cache state on the bot

Both jobs and /recreate write live_prompt_cache_state on the bot document:
| Field | Meaning |
| --- | --- |
| cache_name | Provider-side cached-content resource name (null when none). |
| created_at / expires_at | Creation time and TTL expiry. |
| last_prewarm_at / last_prewarm_status | Last prewarm outcome. |
| last_cleanup_at / last_cleanup_status | Last cleanup outcome (disabled bots). |
The runtime-config builder slims this to {cache_name, expires_at} before sending it to CXB Core as live_prompt_cache_state.
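As an illustration of that slimming step, here is a minimal sketch. The function name slim_cache_state is hypothetical, the sample values are made up, and returning None for a bot with no state is an assumption rather than documented behaviour.

```python
from typing import Optional


def slim_cache_state(state: Optional[dict]) -> Optional[dict]:
    """Keep only the two fields CXB Core needs from live_prompt_cache_state."""
    if not state:
        return None  # assumption: bots without cache state send nothing
    return {
        "cache_name": state.get("cache_name"),
        "expires_at": state.get("expires_at"),
    }


full_state = {
    "cache_name": "example-cache-name",       # illustrative value
    "created_at": "2024-01-01T07:00:00+05:30",
    "expires_at": "2024-01-02T08:00:00+05:30",
    "last_prewarm_at": "2024-01-01T07:00:00+05:30",
    "last_prewarm_status": "ok",              # illustrative value
}
print(slim_cache_state(full_state))
# {'cache_name': 'example-cache-name', 'expires_at': '2024-01-02T08:00:00+05:30'}
```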

Managed-LLM-platform operations

The live-prompt-cache provider service wraps the managed LLM platform client:
| Function | Purpose |
| --- | --- |
| create_cache | caches.create with system_instruction, ttl, optional tools/display_name. Returns the cache name. |
| delete_cache | caches.delete; returns False on 404 (already gone), True on delete. |
| extend_ttl | caches.update to push the TTL out (recovery path). |
Clients are cached per (project_id, location) in _managed_llm_clients and evicted on auth errors (401/403) so stale credentials get rebuilt. Credentials come from the managed-LLM-platform entry in system settings api_keys (service-account JSON); location defaults to us-east4. A managed-LLM-platform bot with no project_id is skipped with a prewarm_failed event (reason: missing_project_id).

Internal endpoints (CXB Core-only)

Both internal live-cache endpoints require X-CXBCore-Secret.
| Endpoint | Body | Behaviour |
| --- | --- | --- |
| POST /api/v1/internal/live-prompt-cache/recreate | { bot_id } | Inline single-flight prewarm for one bot. Returns { cache_name }, or 503 recreate_failed / 404 bot not found. Used by CXB Core when a cache expires mid-call. |
| POST /api/v1/internal/live-prompt-cache/event | { bot_id, event_type, cache_name?, details? } | Append-only audit sink so CXB Core can record events like expired_in_call, swap_after_expiry. Unknown event_type returns 400. |
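For orientation, a /recreate call from the CXB Core side could be assembled as below. This is a sketch using only the standard library: the base URL, secret value, and the helper name build_recreate_request are placeholders, and the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_recreate_request(base_url: str, secret: str, bot_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST that asks CXB API to recreate a bot's cache."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/internal/live-prompt-cache/recreate",
        data=json.dumps({"bot_id": bot_id}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-CXBCore-Secret": secret,  # required by both internal endpoints
        },
        method="POST",
    )


req = build_recreate_request("http://cxb-api.example.internal", "placeholder-secret", "bot1")
print(req.get_method(), req.full_url)
# POST http://cxb-api.example.internal/api/v1/internal/live-prompt-cache/recreate
```

Sending it with urllib.request.urlopen(req) would raise urllib.error.HTTPError for the 503 and 404 responses, so a caller would need to handle those before swapping in the returned cache_name.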

Audit log

The live-prompt-cache audit service writes append-only docs to live_prompt_cache_events. record_cache_event validates event_type against an allow-list and never propagates insert failures (audit must not break a call). Allowed event types: created, expired_in_call, swap_after_expiry, invalidated_at_shutdown, extended, recreated_after_expiry, prewarm_succeeded, prewarm_failed, cleanup_succeeded, cleanup_failed, catchup_triggered. The collection has indexes on (bot_id, ts) and (event_type, ts), plus a 90-day TTL index on ts.
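The validate-then-swallow behaviour of record_cache_event can be sketched as follows. Assumptions are labelled: the signature, the ValueError on an unknown type, the boolean return value, and the two stand-in collection classes are illustrative; only the allow-list and the rule that insert failures never propagate come from the description above.

```python
from datetime import datetime, timezone

ALLOWED_EVENT_TYPES = frozenset({
    "created", "expired_in_call", "swap_after_expiry", "invalidated_at_shutdown",
    "extended", "recreated_after_expiry", "prewarm_succeeded", "prewarm_failed",
    "cleanup_succeeded", "cleanup_failed", "catchup_triggered",
})


def record_cache_event(collection, bot_id, event_type, cache_name=None, details=None) -> bool:
    """Append one audit doc; reject unknown types, swallow storage failures."""
    if event_type not in ALLOWED_EVENT_TYPES:
        raise ValueError(f"unknown event_type: {event_type}")  # assumed error type
    doc = {
        "bot_id": bot_id,
        "event_type": event_type,
        "cache_name": cache_name,
        "details": details or {},
        "ts": datetime.now(timezone.utc),
    }
    try:
        collection.insert_one(doc)
        return True
    except Exception:
        return False  # audit must not break a call


class ListCollection:
    """Stand-in collection that stores docs in a list."""

    def __init__(self):
        self.docs = []

    def insert_one(self, doc):
        self.docs.append(doc)


class BrokenCollection:
    """Stand-in collection whose inserts always fail."""

    def insert_one(self, doc):
        raise ConnectionError("db down")


events = ListCollection()
print(record_cache_event(events, "bot1", "expired_in_call", cache_name="c1"))  # True
print(record_cache_event(BrokenCollection(), "bot1", "created"))               # False
```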

Operational checks

| Symptom | First place to check |
| --- | --- |
| No cache created for an enabled bot | live_prompt_cache.enabled, static prompt non-empty, prewarm_failed events. |
| Managed-LLM-platform bot never caches | Missing project_id in llm.extra (reason: missing_project_id). |
| Prewarm skipped | Redis lock held by another instance (skipped: lock_held). |
| Cache deleted while still needed | A bot was disabled: cleanup targets disabled bots only. |
| Mid-call expiry not recovering | CXB Core /recreate call, then swap_after_expiry audit event. |

CXB Core caching

How CXB Core injects cached_content and swaps on expiry.

Runtime config

Where live_prompt_cache_state is sent to CXB Core.

Conversation policy

Policy bots (which do not use the live prompt cache path).

Post-call processing

The separate post-call/QC explicit cache.