# Prompt Caching

How to use prompt caching in ComputerAgent and agent loops.
Prompt caching is a feature offered by some LLM API providers that avoids reprocessing identical prompt prefixes, reducing cost and latency for repeated or long-running tasks.
## Usage
The `use_prompt_caching` argument is available for `ComputerAgent` and agent loops:
```python
agent = ComputerAgent(
    ...,
    use_prompt_caching=True,
)
```
- Type: `bool`
- Default: `False`
- Purpose: Use prompt caching to avoid reprocessing the same prompt.
## Anthropic CUAs
When using Anthropic-based CUAs (Claude models), setting `use_prompt_caching=True` automatically adds `{ "cache_control": "ephemeral" }` to your messages. This enables prompt caching for the session and can speed up repeated runs with the same prompt.
Note: This argument only has an effect for Anthropic CUAs; other providers ignore it.
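For reference, Anthropic's Messages API expresses cache breakpoints as a `cache_control` field on a content block. The sketch below shows roughly what a cached message might look like after the flag is applied; the exact block the agent loop marks is an implementation detail and may differ:

```python
# Sketch only: approximate message shape produced when
# use_prompt_caching=True with an Anthropic CUA. The agent loop
# handles this for you; you never need to build it by hand.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Open the browser and search for flights to Tokyo.",
                # Anthropic's API format for an ephemeral cache breakpoint
                "cache_control": {"type": "ephemeral"},
            }
        ],
    }
]
```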
## OpenAI Provider
With the OpenAI provider, prompt caching is handled automatically for prompts of 1024+ tokens. You do not need to set `use_prompt_caching`; caching occurs for long prompts without any extra configuration.
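For example, an agent backed by an OpenAI model needs no caching flag at all (the model string below is illustrative; use whichever OpenAI model your setup supports):

```python
from agent import ComputerAgent

# No caching flag needed: the OpenAI API caches long prompt
# prefixes (1024+ tokens) automatically.
agent = ComputerAgent(
    model="openai/computer-use-preview",  # illustrative model string
)
```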
## Example
```python
from agent import ComputerAgent

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20240620",
    use_prompt_caching=True,
)
```
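Repeated runs benefit most when the prompt prefix stays identical between calls. A minimal usage sketch, assuming the agent loop's standard async `run` entry point:

```python
import asyncio

async def main():
    # Identical prompts across iterations let the cached prefix be reused.
    prompt = "Open the settings app and enable dark mode."
    async for result in agent.run(prompt):
        print(result)

asyncio.run(main())
```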
## Implementation Details
- For Anthropic: Adds `{ "cache_control": "ephemeral" }` to messages when enabled.
- For OpenAI: Caching is automatic for long prompts; the argument is ignored.
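A rough sketch of the Anthropic branch described above. The helper name is hypothetical; the real logic lives inside the agent loop and may mark a different content block:

```python
def _apply_prompt_caching(messages: list[dict]) -> list[dict]:
    """Hypothetical helper: mark the final content block of the last
    message as an ephemeral cache breakpoint, per Anthropic's
    cache_control format."""
    if not messages:
        return messages
    content = messages[-1].get("content")
    if isinstance(content, list) and content:
        content[-1]["cache_control"] = {"type": "ephemeral"}
    return messages
```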
## When to Use
- Enable for Anthropic CUAs if you want to avoid reprocessing the same prompt in repeated or iterative tasks.
- Not needed for OpenAI models: long prompts are cached automatically, and the flag has no effect.