Anthropic's prompt caching allows you to cache parts of a prompt. As explained in their documentation, you must opt in within the prompt before any of it is cached. To do this, add `cacheControl: true` to the front matter of your prompt.
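For example, a prompt's front matter with caching enabled might look like this (the provider and model values are illustrative placeholders):

```
---
provider: anthropic
model: claude-3-5-sonnet-latest
cacheControl: true
---
```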
Once this is set up, you can start caching specific parts of the prompt:
```
<system>
  This part of the text is not cached
</system>

<system>
  Read this large book and answer users' questions.
  <text cache_control={{ { type: 'ephemeral' } }}>
    ...BIG_BOOK_CONTENT...
  </text>
</system>
```
If you want an entire message to be cached, add the `cache_control` attribute to the `user`, `assistant`, or `system` tag:
```
<user cache_control={{ { type: 'ephemeral' } }}>
  This text will be cached.
  <text>
    This text will also be cached.
  </text>
  And this text as well.
</user>
```
The latest Anthropic models have a configurable thinking budget that allows you to control how many tokens the model spends reasoning before generating a response.
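At the provider level, this corresponds to the `thinking` parameter on Anthropic's Messages API. A minimal request-body sketch (the model name and token values are illustrative; note that `max_tokens` must be larger than `budget_tokens`):

```json
{
  "model": "claude-3-7-sonnet-latest",
  "max_tokens": 16000,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  },
  "messages": [{ "role": "user", "content": "..." }]
}
```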