
Anthropic

LiteLLM supports all Anthropic models.

  • claude-3.5 (claude-3-5-sonnet-20240620)
  • claude-3 (claude-3-haiku-20240307, claude-3-opus-20240229, claude-3-sonnet-20240229)
  • claude-2
  • claude-2.1
  • claude-instant-1.2
Property | Details
Description | Claude is a highly performant, trustworthy, and intelligent AI platform built by Anthropic. Claude excels at tasks involving language, reasoning, analysis, coding, and more.
Provider Route on LiteLLM | anthropic/ (add this prefix to the model name to route requests to Anthropic - e.g. anthropic/claude-3-5-sonnet-20240620)
Provider Doc | Anthropic ↗
API Endpoint for Provider | https://api.anthropic.com
Supported Endpoints | /chat/completions
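
For example, a minimal sketch of routing a request to Anthropic with the anthropic/ prefix (assuming ANTHROPIC_API_KEY is already set in your environment):

import os
from litellm import completion

# assumes os.environ["ANTHROPIC_API_KEY"] is already set to your key
response = completion(
    model="anthropic/claude-3-5-sonnet-20240620",  # the anthropic/ prefix routes the request to Anthropic
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response)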

Supported OpenAI Parameters

Check this in code, here

"stream",
"stop",
"temperature",
"top_p",
"max_tokens",
"max_completion_tokens",
"tools",
"tool_choice",
"extra_headers",
"parallel_tool_calls",
"response_format",
"user"
Info

The Anthropic API fails requests when max_tokens is not passed. Because of this, litellm passes max_tokens=4096 when no max_tokens is passed.
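
For example, a minimal sketch passing several of the supported OpenAI params listed above, with max_tokens set explicitly so the 4096 default is not used:

import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

# explicit max_tokens overrides litellm's default of 4096; the other params are from the supported list above
response = completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    temperature=0.7,
    top_p=0.9,
    max_tokens=256,
    stop=["\n\n"],
    user="user_123",
)
print(response.choices[0].message.content)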

API Keys

import os

os.environ["ANTHROPIC_API_KEY"] = "your-api-key"
# os.environ["ANTHROPIC_API_BASE"] = "" # [OPTIONAL] or 'ANTHROPIC_BASE_URL'

Usage

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [{"role": "user", "content": "Hey! how's it going?"}]
response = completion(model="claude-3-opus-20240229", messages=messages)
print(response)

Usage - Streaming

Just set stream=True when calling completion.

import os
from litellm import completion

# set env
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [{"role": "user", "content": "Hey! how's it going?"}]
response = completion(model="claude-3-opus-20240229", messages=messages, stream=True)
for chunk in response:
    print(chunk["choices"][0]["delta"]["content"])  # same as openai format

Usage with LiteLLM Proxy

Here's how to call Anthropic with the LiteLLM Proxy Server

1. Save the key in your environment

export ANTHROPIC_API_KEY="your-api-key"

2. Start the proxy

model_list:
  - model_name: claude-3 ### RECEIVED MODEL NAME ###
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: claude-3-opus-20240229 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/ANTHROPIC_API_KEY" # does os.getenv("ANTHROPIC_API_KEY")

litellm --config /path/to/config.yaml

3. Test it

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "claude-3",
    "messages": [
        {
            "role": "user",
            "content": "what llm are you"
        }
    ]
}'
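
Since the proxy exposes an OpenAI-compatible /chat/completions endpoint, you can also test it with the OpenAI Python SDK; a minimal sketch (the api_key is a placeholder unless you have configured virtual keys on the proxy):

import openai

# point the OpenAI SDK at the LiteLLM proxy
client = openai.OpenAI(
    api_key="anything",              # placeholder unless the proxy requires a virtual key
    base_url="http://0.0.0.0:4000",
)

response = client.chat.completions.create(
    model="claude-3",  # the RECEIVED MODEL NAME from the proxy config above
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(response)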

Supported Models

Model Name 👉 Human-friendly name.
Function Call 👉 How to call the model in LiteLLM.

Model Name | Function Call
claude-3-5-sonnet | completion('claude-3-5-sonnet-20240620', messages)
claude-3-haiku | completion('claude-3-haiku-20240307', messages)
claude-3-opus | completion('claude-3-opus-20240229', messages)
claude-3-5-sonnet-20240620 | completion('claude-3-5-sonnet-20240620', messages)
claude-3-sonnet | completion('claude-3-sonnet-20240229', messages)
claude-2.1 | completion('claude-2.1', messages)
claude-2 | completion('claude-2', messages)
claude-instant-1.2 | completion('claude-instant-1.2', messages)
claude-instant-1 | completion('claude-instant-1', messages)

Prompt Caching

Use Anthropic Prompt Caching

Relevant Anthropic API Docs

Note

Here's an example of the raw request LiteLLM sends to Anthropic for context caching:

POST Request Sent from LiteLLM:
curl -X POST \
https://api.anthropic.com/v1/messages \
-H 'accept: application/json' -H 'anthropic-version: 2023-06-01' -H 'content-type: application/json' -H 'x-api-key: sk-...' -H 'anthropic-beta: prompt-caching-2024-07-31' \
-d '{
  "model": "claude-3-5-sonnet-20240620",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What are the key terms and conditions in this agreement?",
          "cache_control": {
            "type": "ephemeral"
          }
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Certainly! The key terms and conditions are the following: the contract is 1 year long for $10/mo"
        }
      ]
    }
  ],
  "temperature": 0.2,
  "max_tokens": 10
}'

Caching - Large Context Caching

This example demonstrates basic Prompt Caching usage, caching the full text of the legal agreement as a prefix while keeping the user instruction uncached.

import litellm

response = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are an AI assistant tasked with analyzing legal documents.",
                },
                {
                    "type": "text",
                    "text": "Here is the full text of a complex legal agreement",
                    "cache_control": {"type": "ephemeral"},
                },
            ],
        },
        {
            "role": "user",
            "content": "what are the key terms and conditions in this agreement?",
        },
    ]
)
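
To check whether the prefix was actually cached, you can inspect the cache token counts LiteLLM reports on the usage object (the same cache_creation_input_tokens / cache_read_input_tokens fields shown in the Expected Response example further down this page); a minimal sketch:

# on the first call the prefix is written to the cache; on later calls it is read from it
print(response.usage.cache_creation_input_tokens)  # tokens written to the cache
print(response.usage.cache_read_input_tokens)      # tokens served from the cache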

Caching - Tool definitions

In this example, we demonstrate caching tool definitions.

The cache_control parameter is placed on the final tool.

import litellm

response = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "What's the weather like in Boston today?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
                "cache_control": {"type": "ephemeral"}
            },
        }
    ]
)

Caching - Continuing Multi-Turn Conversations

In this example, we demonstrate how to use Prompt Caching in a multi-turn conversation.

The cache_control parameter is placed on the system message to designate it as part of the static prefix.

The conversation history (previous messages) is included in the messages array. The final turn is marked with cache-control so it can be reused in follow-up requests. The second-to-last user message is marked for caching with the cache_control parameter, so that this checkpoint can read from the previous cache.

import litellm

response = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        # System Message
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Here is the full text of a complex legal agreement" * 400,
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        # marked for caching with the cache_control parameter, so that this checkpoint can read from the previous cache.
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key terms and conditions in this agreement?",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {
            "role": "assistant",
            "content": "Certainly! the key terms and conditions are the following: the contract is 1 year long for $10/mo",
        },
        # The final turn is marked with cache-control, for continuing in followups.
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What are the key terms and conditions in this agreement?",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
    ]
)
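
To continue this conversation in a later request, resend the same prefix plus the new turns; a rough sketch, assuming the earlier turns are rebuilt exactly as above so the cached prefix still matches (the follow-up question text is illustrative):

followup_messages = [
    # ... the same system message and earlier turns as above, unchanged, so the cached prefix matches ...
    {"role": "assistant", "content": response.choices[0].message.content},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Summarize the termination clauses.",
                "cache_control": {"type": "ephemeral"},
            }
        ],
    },
]
followup = await litellm.acompletion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=followup_messages,
)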

Function/Tool Calling

Info

LiteLLM now uses Anthropic's 'tool' param 🎉 (v1.34.29+)

import os
from litellm import completion

# set env
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
    model="anthropic/claude-3-opus-20240229",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
# Add any assertions, here to check response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
    response.choices[0].message.tool_calls[0].function.arguments, str
)

Forcing Anthropic Tool Use

If you want Claude to use a specific tool to answer the user's question,

you can do so by specifying the tool in the tool_choice field like so:

response = completion(
    model="anthropic/claude-3-opus-20240229",
    messages=messages,
    tools=tools,
    tool_choice={"type": "tool", "name": "get_weather"},
)

Parallel Function Calling

Here's how to pass the result of a function call back to an Anthropic model:

import litellm
from litellm import completion
import os

os.environ["ANTHROPIC_API_KEY"] = "sk-ant.."


litellm.set_verbose = True

### 1ST FUNCTION CALL ###
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [
    {
        "role": "user",
        "content": "What's the weather like in Boston today in Fahrenheit?",
    }
]
try:
    # test without max tokens
    response = completion(
        model="anthropic/claude-3-opus-20240229",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    # Add any assertions, here to check response args
    print(response)
    assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
    assert isinstance(
        response.choices[0].message.tool_calls[0].function.arguments, str
    )

    messages.append(
        response.choices[0].message.model_dump()
    )  # Add assistant tool invokes
    tool_result = (
        '{"location": "Boston", "temperature": "72", "unit": "fahrenheit"}'
    )
    # Add user submitted tool results in the OpenAI format
    messages.append(
        {
            "tool_call_id": response.choices[0].message.tool_calls[0].id,
            "role": "tool",
            "name": response.choices[0].message.tool_calls[0].function.name,
            "content": tool_result,
        }
    )
    ### 2ND FUNCTION CALL ###
    # In the second response, Claude should deduce answer from tool results
    second_response = completion(
        model="anthropic/claude-3-opus-20240229",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    print(second_response)
except Exception as e:
    print(f"An error occurred - {str(e)}")

Computer Tools

s/o @Shekhar Patnaik for requesting this feature!

from litellm import completion

tools = [
    {
        "type": "computer_20241022",
        "function": {
            "name": "computer",
            "parameters": {
                "display_height_px": 100,
                "display_width_px": 100,
                "display_number": 1,
            },
        },
    }
]
model = "claude-3-5-sonnet-20241022"
messages = [{"role": "user", "content": "Save a picture of a cat to my desktop."}]

resp = completion(
    model=model,
    messages=messages,
    tools=tools,
    # headers={"anthropic-beta": "computer-use-2024-10-22"},
)

print(resp)

Usage - Vision

import os
import litellm

# set env
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

def encode_image(image_path):
    import base64

    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


image_path = "../proxy/cached_logo.jpg"
# Getting the base64 string
base64_image = encode_image(image_path)
resp = litellm.completion(
    model="anthropic/claude-3-opus-20240229",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Whats in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/jpeg;base64," + base64_image
                    },
                },
            ],
        }
    ],
)
print(f"\nResponse: {resp}")

Usage - Thinking / reasoning_content

LiteLLM translates OpenAI's reasoning_effort to Anthropic's thinking parameter. Code

reasoning_effort | thinking
"low" | "budget_tokens": 1024
"medium" | "budget_tokens": 2048
"high" | "budget_tokens": 4096
from litellm import completion

resp = completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)

Expected Response

ModelResponse(
    id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
    created=1740470510,
    model='claude-3-7-sonnet-20250219',
    object='chat.completion',
    system_fingerprint=None,
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content="The capital of France is Paris.",
                role='assistant',
                tool_calls=None,
                function_call=None,
                provider_specific_fields={
                    'citations': None,
                    'thinking_blocks': [
                        {
                            'type': 'thinking',
                            'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                            'signature': 'EuYBCkQYAiJAy6...'
                        }
                    ]
                }
            ),
            thinking_blocks=[
                {
                    'type': 'thinking',
                    'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                    'signature': 'EuYBCkQYAiJAy6AGB...'
                }
            ],
            reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
        )
    ],
    usage=Usage(
        completion_tokens=68,
        prompt_tokens=42,
        total_tokens=110,
        completion_tokens_details=None,
        prompt_tokens_details=PromptTokensDetailsWrapper(
            audio_tokens=None,
            cached_tokens=0,
            text_tokens=None,
            image_tokens=None
        ),
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
)

Pass thinking to Anthropic models

You can also pass the thinking parameter to Anthropic models.

import litellm

response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)

Passing Extra Headers to the Anthropic API

Pass extra_headers: dict to litellm.completion.

from litellm import completion
messages = [{"role": "user", "content": "What is Anthropic?"}]
response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=messages,
    extra_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"}
)

Verwendung - "Assistant Pre-fill"

Sie können "Claude Worte in den Mund legen", indem Sie eine Nachricht mit der Rolle assistant als letztes Element im Array messages aufnehmen.

[!IMPORTANT] The returned completion will *not* include your "pre-fill" text, since it is part of the prompt itself. Make sure to prefix Claude's completion with your pre-fill.

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [
    {"role": "user", "content": "How do you say 'Hello' in German? Return your answer as a JSON object, like this:\n\n{ \"Hello\": \"Hallo\" }"},
    {"role": "assistant", "content": "{"},
]
response = completion(model="claude-2.1", messages=messages)
print(response)
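
As the note above says, the returned completion does not repeat the pre-fill, so prepend it yourself before parsing; a minimal sketch (the example output is illustrative):

# the response content starts after the pre-filled "{", so prepend it before parsing
json_str = "{" + response.choices[0].message.content
print(json_str)  # e.g. { "Hello": "Hallo" }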

Example prompt sent to Claude


Human: How do you say 'Hello' in German? Return your answer as a JSON object, like this:

{ "Hello": "Hallo" }

Assistant: {

Verwendung - "System"-Nachrichten

If you're using Anthropic's Claude 2.1, system role messages are properly formatted for you.

import os
from litellm import completion

# set env - [OPTIONAL] replace with your anthropic key
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

messages = [
    {"role": "system", "content": "You are a snarky assistant."},
    {"role": "user", "content": "How do I boil water?"},
]
response = completion(model="claude-2.1", messages=messages)

Example prompt sent to Claude

You are a snarky assistant.

Human: How do I boil water?

Assistant:

Usage - PDF

Pass base64-encoded PDF files to Anthropic models via the file field in the message content.

Using Base64

from litellm import completion, supports_pdf_input
import base64
import requests

# URL of the file
url = "https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf"

# Download the file
response = requests.get(url)
file_data = response.content

encoded_file = base64.b64encode(file_data).decode("utf-8")

## check if model supports pdf input - (2024/11/11) only claude-3-5-haiku-20241022 supports it
supports_pdf_input("anthropic/claude-3-5-haiku-20241022") # True

response = completion(
    model="anthropic/claude-3-5-haiku-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "You are a very professional document summarization specialist. Please summarize the given document."},
                {
                    "type": "file",
                    "file": {
                        "file_data": f"data:application/pdf;base64,{encoded_file}", # 👈 PDF
                    }
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])

[BETA] Citations API

Pass citations: {"enabled": true} to Anthropic to get citations on your document responses.

Note: This interface is in BETA. If you have feedback on how citations should be returned, please tell us here.

from litellm import completion

resp = completion(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "The grass is green. The sky is blue.",
                    },
                    "title": "My Document",
                    "context": "This is a trustworthy document.",
                    "citations": {"enabled": True},
                },
                {
                    "type": "text",
                    "text": "What color is the grass and sky?",
                },
            ],
        }
    ],
)

citations = resp.choices[0].message.provider_specific_fields["citations"]

assert citations is not None

Usage - passing 'user_id' to Anthropic

LiteLLM translates the OpenAI user param to Anthropic's metadata[user_id] param.

response = completion(
    model="claude-3-5-sonnet-20240620",
    messages=messages,
    user="user_123",
)