
Anthropic

LiteLLM supports all Anthropic models.

  • claude-3.5 (claude-3-5-sonnet-20240620)
  • claude-3 (claude-3-haiku-20240307, claude-3-opus-20240229, claude-3-sonnet-20240229)
  • claude-2
  • claude-2.1
  • claude-instant-1.2
Property | Details
Description | Claude is a highly performant, trustworthy, and intelligent AI platform built by Anthropic. Claude excels at tasks involving language, reasoning, analysis, coding, and more.
Provider Route on LiteLLM | anthropic/ (add this prefix to the model name to route requests to Anthropic - e.g. anthropic/claude-3-5-sonnet-20240620)
Provider Doc | Anthropic ↗
API Endpoint for Provider | https://api.anthropic.com
Supported Endpoints | /chat/completions
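
For example, a minimal sketch of routing a request to Anthropic with the anthropic/ prefix (assuming ANTHROPIC_API_KEY is already set in your environment):

import os
from litellm import completion

# assumes os.environ["ANTHROPIC_API_KEY"] is already set to your key
response = completion(
    model="anthropic/claude-3-5-sonnet-20240620",  # the anthropic/ prefix routes the request to Anthropic
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response)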

Supported OpenAI Parameters

Check this in code, here

"stream",
"stop",
"temperature",
"top_p",
"max_tokens",
"max_completion_tokens",
"tools",
"tool_choice",
"extra_headers",
"parallel_tool_calls",
"response_format",
"user"
Info

The Anthropic API fails requests when max_tokens is not passed. Because of this, litellm passes max_tokens=4096 when no max_tokens is passed.
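
For example, a minimal sketch passing several of the supported OpenAI params listed above, with max_tokens set explicitly so the 4096 default is not used:

import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

# explicit max_tokens overrides litellm's default of 4096; the other params are from the supported list above
response = completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    temperature=0.7,
    top_p=0.9,
    max_tokens=256,
    stop=["\n\n"],
    user="user_123",
)
print(response.choices[0].message.content)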

API Keys

import os

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"
# os.environ["ANTHROPIC_API_BASE"] = "" # [OPTIONAL] or 'ANTHROPIC_BASE_URL'

Usage

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [{"role": "user", "content": "Hey! how's it going?"}]
response = completion(model="claude-3-opus-20240229", messages=messages)
print(response)

Usage - Streaming

Just set stream=True when calling completion.

import os
from litellm import completion

# set env
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [{"role": "user", "content": "Hey! how's it going?"}]
response = completion(model="claude-3-opus-20240229", messages=messages, stream=True)
for chunk in response:
    print(chunk["choices"][0]["delta"]["content"])  # same as openai format

Usage with LiteLLM Proxy

Here's how to call Anthropic with the LiteLLM Proxy Server

1. Save the key in your environment

export ANTHROPIC_API_KEY="your-api-key"

2. Start the proxy

model_list:
  - model_name: claude-3 ### RECEIVED MODEL NAME ###
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-3-opus-20240229 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # does os.getenv("ANTHROPIC_API_KEY")

litellm --config /path/to/config.yaml

3. Test it

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "claude-3",
    "messages": [
        {
            "role": "user",
            "content": "what llm are you"
        }
    ]
}'
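
Since the proxy exposes an OpenAI-compatible /chat/completions endpoint, you can also test it with the OpenAI Python SDK; a minimal sketch (the api_key is a placeholder unless you have configured virtual keys on the proxy):

import openai

# point the OpenAI SDK at the LiteLLM proxy
client = openai.OpenAI(
    api_key="anything",              # placeholder unless the proxy requires a virtual key
    base_url="http://0.0.0.0:4000",
)

response = client.chat.completions.create(
    model="claude-3",  # the RECEIVED MODEL NAME from the proxy config above
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(response)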

Supported Models

Model Name 👉 Human-friendly name.
Function Call 👉 How to call the model in LiteLLM.

Model Name | Function Call
claude-3-5-sonnet | completion('claude-3-5-sonnet-20240620', messages)
claude-3-haiku | completion('claude-3-haiku-20240307', messages)
claude-3-opus | completion('claude-3-opus-20240229', messages)
claude-3-5-sonnet-20240620 | completion('claude-3-5-sonnet-20240620', messages)
claude-3-sonnet | completion('claude-3-sonnet-20240229', messages)
claude-2.1 | completion('claude-2.1', messages)
claude-2 | completion('claude-2', messages)
claude-instant-1.2 | completion('claude-instant-1.2', messages)
claude-instant-1 | completion('claude-instant-1', messages)

Prompt Caching

Use Anthropic Prompt Caching

Relevant Anthropic API Docs

Note

Here's an example of the raw request LiteLLM sends to Anthropic for context caching:

POST Request Sent from LiteLLM:
curl -X POST \
https://api.anthropic.com/v1/messages \
-H 'accept: application/json' -H 'anthropic-version: 2023-06-01' -H 'content-type: application/json' -H 'x-api-key: sk-...' -H 'anthropic-beta: prompt-caching-2024-07-31' \
-d '{
  "model": "claude-3-5-sonnet-20240620",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What are the key terms and conditions in this agreement?",
          "cache_control": {
            "type": "ephemeral"
          }
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Certainly! The key terms and conditions are the following: the contract is 1 year long for $10/mo"
        }
      ]
    }
  ],
  "temperature": 0.2,
  "max_tokens": 10
}'

Caching - Large Context Caching

This example demonstrates basic Prompt Caching usage, caching the full text of the legal agreement as a prefix while keeping the user instruction uncached.

import litellm

response = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are an AI assistant tasked with analyzing legal documents.",
                },
                {
                    "type": "text",
                    "text": "Here is the full text of a complex legal agreement",
                    "cache_control": {"type": "ephemeral"},
                },
            ],
        },
        {
            "role": "user",
            "content": "what are the key terms and conditions in this agreement?",
        },
    ]
)
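
To check whether the prefix was actually cached, you can inspect the cache token counts LiteLLM reports on the usage object (the same cache_creation_input_tokens / cache_read_input_tokens fields shown in the Expected Response example further down this page); a minimal sketch:

# on the first call the prefix is written to the cache; on later calls it is read from it
print(response.usage.cache_creation_input_tokens)  # tokens written to the cache
print(response.usage.cache_read_input_tokens)      # tokens served from the cache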

Caching - Tool definitions

In this example, we demonstrate caching tool definitions.

The cache_control parameter is placed on the final tool.

import litellm

response = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "What's the weather like in Boston today?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
                "cache_control": {"type": "ephemeral"}
            },
        }
    ]
)

Caching - Continuing Multi-Turn Conversations

In this example, we demonstrate how to use Prompt Caching in a multi-turn conversation.

The cache_control parameter is placed on the system message to designate it as part of the static prefix.

The conversation history (previous messages) is included in the messages array. The final turn is marked with cache-control so it can be reused in follow-up requests. The second-to-last user message is marked for caching with the cache_control parameter, so that this checkpoint can read from the previous cache.

import litellm

response = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        # System Message
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Here is the full text of a complex legal agreement" * 400,
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        # marked for caching with the cache_control parameter, so that this checkpoint can read from the previous cache.
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key terms and conditions in this agreement?",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {
            "role": "assistant",
            "content": "Certainly! the key terms and conditions are the following: the contract is 1 year long for $10/mo",
        },
        # The final turn is marked with cache-control, for continuing in followups.
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key terms and conditions in this agreement?",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
    ]
)
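
To continue this conversation in a later request, resend the same prefix plus the new turns; a rough sketch, assuming the earlier turns are rebuilt exactly as above so the cached prefix still matches (the follow-up question text is illustrative):

followup_messages = [
    # ... the same system message and earlier turns as above, unchanged, so the cached prefix matches ...
    {"role": "assistant", "content": response.choices[0].message.content},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Summarize the termination clauses.",
                "cache_control": {"type": "ephemeral"},
            }
        ],
    },
]
followup = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=followup_messages,
)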

Function/Tool Calling

Info

LiteLLM now uses Anthropic's 'tool' param 🎉 (v1.34.29+)

import os
from litellm import completion

# set env
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
    model="anthropic/claude-3-opus-20240229",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
# Add any assertions, here to check response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
    response.choices[0].message.tool_calls[0].function.arguments, str
)

Forcing Anthropic Tool Use

If you want Claude to use a specific tool to answer the user's question,

you can do so by specifying the tool in the tool_choice field like so:

response = completion(
    model="anthropic/claude-3-opus-20240229",
    messages=messages,
    tools=tools,
    tool_choice={"type": "tool", "name": "get_weather"},
)

Parallel Function Calling

Here's how to pass the result of a function call back to an Anthropic model:

import litellm
from litellm import completion
import os

os.environ["ANTHROPIC_API_KEY"] = "sk-ant.."


litellm.set_verbose = True

### 1ST FUNCTION CALL ###
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [
    {
        "role": "user",
        "content": "What's the weather like in Boston today in Fahrenheit?",
    }
]
try:
    # test without max tokens
    response = completion(
        model="anthropic/claude-3-opus-20240229",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    # Add any assertions, here to check response args
    print(response)
    assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
    assert isinstance(
        response.choices[0].message.tool_calls[0].function.arguments, str
    )

    messages.append(
        response.choices[0].message.model_dump()
    )  # Add assistant tool invokes
    tool_result = (
        '{"location": "Boston", "temperature": "72", "unit": "fahrenheit"}'
    )
    # Add user submitted tool results in the OpenAI format
    messages.append(
        {
            "tool_call_id": response.choices[0].message.tool_calls[0].id,
            "role": "tool",
            "name": response.choices[0].message.tool_calls[0].function.name,
            "content": tool_result,
        }
    )
    ### 2ND FUNCTION CALL ###
    # In the second response, Claude should deduce answer from tool results
    second_response = completion(
        model="anthropic/claude-3-opus-20240229",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    print(second_response)
except Exception as e:
    print(f"An error occurred - {str(e)}")

Computer Tools

s/o @Shekhar Patnaik for requesting this feature!

from litellm import completion

tools = [
    {
        "type": "computer_20241022",
        "function": {
            "name": "computer",
            "parameters": {
                "display_height_px": 100,
                "display_width_px": 100,
                "display_number": 1,
            },
        },
    }
]
model = "claude-3-5-sonnet-20241022"
messages = [{"role": "user", "content": "Save a picture of a cat to my desktop."}]

resp = completion(
    model=model,
    messages=messages,
    tools=tools,
    # headers={"anthropic-beta": "computer-use-2024-10-22"},
)

print(resp)

Usage - Vision

import os
import litellm

# set env
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

def encode_image(image_path):
    import base64

    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


image_path = "../proxy/cached_logo.jpg"
# Getting the base64 string
base64_image = encode_image(image_path)
resp = litellm.completion(
    model="anthropic/claude-3-opus-20240229",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Whats in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/jpeg;base64," + base64_image
                    },
                },
            ],
        }
    ],
)
print(f"\nResponse: {resp}")

Usage - Thinking / reasoning_content

LiteLLM translates OpenAI's reasoning_effort to Anthropic's thinking parameter. Code

reasoning_effort | thinking
"low" | "budget_tokens": 1024
"medium" | "budget_tokens": 2048
"high" | "budget_tokens": 4096
from litellm import completion

resp = completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)

Expected Response

ModelResponse(
    id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
    created=1740470510,
    model='claude-3-7-sonnet-20250219',
    object='chat.completion',
    system_fingerprint=None,
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content="The capital of France is Paris.",
                role='assistant',
                tool_calls=None,
                function_call=None,
                provider_specific_fields={
                    'citations': None,
                    'thinking_blocks': [
                        {
                            'type': 'thinking',
                            'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                            'signature': 'EuYBCkQYAiJAy6...'
                        }
                    ]
                }
            ),
            thinking_blocks=[
                {
                    'type': 'thinking',
                    'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                    'signature': 'EuYBCkQYAiJAy6AGB...'
                }
            ],
            reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
        )
    ],
    usage=Usage(
        completion_tokens=68,
        prompt_tokens=42,
        total_tokens=110,
        completion_tokens_details=None,
        prompt_tokens_details=PromptTokensDetailsWrapper(
            audio_tokens=None,
            cached_tokens=0,
            text_tokens=None,
            image_tokens=None
        ),
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
)

Pass thinking to Anthropic models

You can also pass the thinking parameter to Anthropic models.

import litellm

response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)

Passing Extra Headers to the Anthropic API

Pass extra_headers: dict to litellm.completion.

from litellm import completion
messages = [{"role": "user", "content": "What is Anthropic?"}]
response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=messages,
    extra_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"}
)

Verwendung - "Assistant Pre-fill"

Sie können "Claude Worte in den Mund legen", indem Sie eine Nachricht mit der Rolle assistant als letztes Element im Array messages aufnehmen.

[!IMPORTANT] The returned completion will *not* include your "pre-fill" text, since it is part of the prompt itself. Make sure to prefix Claude's completion with your pre-fill.

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [
    {"role": "user", "content": "How do you say 'Hello' in German? Return your answer as a JSON object, like this:\n\n{ \"Hello\": \"Hallo\" }"},
    {"role": "assistant", "content": "{"},
]
response = completion(model="claude-2.1", messages=messages)
print(response)
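
As the note above says, the returned completion does not repeat the pre-fill, so prepend it yourself before parsing; a minimal sketch (the example output is illustrative):

# the response content starts after the pre-filled "{", so prepend it before parsing
json_str = "{" + response.choices[0].message.content
print(json_str)  # e.g. { "Hello": "Hallo" }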

Example prompt sent to Claude


Human: How do you say 'Hello' in German? Return your answer as a JSON object, like this:

{ "Hello": "Hallo" }

Assistant: {

Verwendung - "System"-Nachrichten

If you're using Anthropic's Claude 2.1, system role messages are properly formatted for you.

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [
    {"role": "system", "content": "You are a snarky assistant."},
    {"role": "user", "content": "How do I boil water?"},
]
response = completion(model="claude-2.1", messages=messages)

Example prompt sent to Claude

You are a snarky assistant.

Human: How do I boil water?

Assistant:

Usage - PDF

Pass base64-encoded PDF files to Anthropic models via the file field in the message content.

Using Base64

from litellm import completion, supports_pdf_input
import base64
import requests

# URL of the file
url = "https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf"

# Download the file
response = requests.get(url)
file_data = response.content

encoded_file = base64.b64encode(file_data).decode("utf-8")

## check if model supports pdf input - (2024/11/11) only claude-3-5-haiku-20241022 supports it
supports_pdf_input("anthropic/claude-3-5-haiku-20241022") # True

response = completion(
    model="anthropic/claude-3-5-haiku-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "You are a very professional document summarization specialist. Please summarize the given document."},
                {
                    "type": "file",
                    "file": {
                        "file_data": f"data:application/pdf;base64,{encoded_file}", # 👈 PDF
                    }
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])

[BETA] Citations API

Pass citations: {"enabled": true} to Anthropic to get citations on your document responses.

Note: This interface is in BETA. If you have feedback on how citations should be returned, please tell us here.

from litellm import completion

resp = completion(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "The grass is green. The sky is blue.",
                    },
                    "title": "My Document",
                    "context": "This is a trustworthy document.",
                    "citations": {"enabled": True},
                },
                {
                    "type": "text",
                    "text": "What color is the grass and sky?",
                },
            ],
        }
    ],
)

citations = resp.choices[0].message.provider_specific_fields["citations"]

assert citations is not None

Usage - passing 'user_id' to Anthropic

LiteLLM translates the OpenAI user param to Anthropic's metadata[user_id] param.

response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=messages,
    user="user_123",
)