Prompt Caching

Prompt caching reduces token costs and latency by reusing a cached prompt prefix (typically a long system prompt) across requests. Provider support varies:

  • Anthropic: Explicit cache_control blocks with ephemeral type
  • Google Gemini: CachedContent API with configurable TTL (5 minutes default)
  • OpenAI: Automatic prefix caching (no client changes needed, active for prompts >1024 tokens)
  • xAI: Not currently available; monitored for future support
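Of the providers above, Anthropic is the one that requires an explicit marker in the request. A minimal sketch of what that marker looks like in a Messages API request body, assuming a hypothetical long system prompt and placeholder model name (the body is built but not sent, so no API key is needed):

```python
# Sketch of an Anthropic Messages API request body with an explicit
# cache breakpoint: the system prompt block carries a cache_control
# field of type "ephemeral" so later requests can reuse the cached prefix.
LONG_SYSTEM_PROMPT = "You are a helpful assistant. " * 200  # placeholder text

request_body = {
    "model": "claude-3-5-sonnet-20241022",  # placeholder model name
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # "ephemeral" is the cache type named in the docs above
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": "Summarize the refund policy."}
    ],
}
```

Only the blocks up to and including the `cache_control` marker are cached; the user message after it changes freely between requests.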

Caching is always on and invisible to the user; no settings or configuration are required.
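For Gemini, the cached prefix lives in a server-side CachedContent resource whose TTL can be overridden. A hedged sketch of the REST payload, with hypothetical model name and content (again built locally, not sent):

```python
# Sketch of a Gemini CachedContent creation payload. The "ttl" field
# overrides the 5-minute default; it is expressed as a duration string
# in seconds, e.g. "300s" for the default.
cached_content = {
    "model": "models/gemini-1.5-flash-001",  # placeholder model name
    "contents": [
        {
            "role": "user",
            # placeholder stand-in for a large shared context
            "parts": [{"text": "LARGE_SHARED_CONTEXT " * 500}],
        }
    ],
    "ttl": "600s",  # keep the cache for 10 minutes instead of the 5-minute default
}
```

Requests that reference the resulting cached content are billed at the reduced cached-token rate for the shared prefix.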