Databricks

LiteLLM supports all models on Databricks.

Tip

We support ALL Databricks models. Use the databricks/ prefix, i.e. model=databricks/<any-model-on-databricks>, when sending LiteLLM requests.
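As a minimal sketch of the prefix rule, the helper below prepends `databricks/` to a serving-endpoint name unless the prefix is already present. The function name `to_litellm_model` is hypothetical and not part of LiteLLM; it only illustrates how the model string is formed.

```python
PROVIDER_PREFIX = "databricks/"


def to_litellm_model(endpoint_name: str) -> str:
    """Return the model string for a Databricks endpoint (hypothetical helper)."""
    name = endpoint_name.strip()
    if not name:
        raise ValueError("endpoint name must not be empty")
    # Leave names that already carry the provider prefix untouched.
    if name.startswith(PROVIDER_PREFIX):
        return name
    return PROVIDER_PREFIX + name


print(to_litellm_model("databricks-dbrx-instruct"))
```

The returned string is what you would pass as `model=` to `completion`.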

Usage

SDK
PROXY

ENV VAR

import os 
os.environ["DATABRICKS_API_KEY"] = ""
os.environ["DATABRICKS_API_BASE"] = ""
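The empty strings above are placeholders. A small pre-flight check, sketched below with a hypothetical helper `missing_databricks_vars`, can report which of the two variables are still unset or empty before any request is made. It uses only the standard library and is not part of LiteLLM.

```python
import os

REQUIRED_VARS = ("DATABRICKS_API_KEY", "DATABRICKS_API_BASE")


def missing_databricks_vars(env=None):
    """List required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_databricks_vars()
if missing:
    print("Set these before calling LiteLLM:", ", ".join(missing))
```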

Example Call

from litellm import completion
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url" # e.g.: https://adb-3064715882934586.6.azuredatabricks.net/serving-endpoints

# Databricks dbrx-instruct call
response = completion(
    model="databricks/databricks-dbrx-instruct", 
    messages = [{ "content": "Hello, how are you?","role": "user"}]
)

Add models to your config.yaml

model_list:
  - model_name: dbrx-instruct
    litellm_params:
      model: databricks/databricks-dbrx-instruct
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE
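Values written as `os.environ/<NAME>` are read from the environment rather than taken literally. The sketch below illustrates that convention with a hypothetical `resolve_config_value` function; it is an illustrative reimplementation under that assumption, not LiteLLM's actual code.

```python
import os

ENV_PREFIX = "os.environ/"


def resolve_config_value(value, env=None):
    """Resolve an 'os.environ/NAME' reference; return other values unchanged."""
    env = os.environ if env is None else env
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        var_name = value[len(ENV_PREFIX):]
        if var_name not in env:
            raise KeyError(f"environment variable {var_name} is not set")
        return env[var_name]
    # Literal values (model names, numbers, ...) pass through as-is.
    return value


example_env = {"DATABRICKS_API_KEY": "databricks key"}
print(resolve_config_value("os.environ/DATABRICKS_API_KEY", example_env))
```

Keeping secrets in environment variables this way means the config file itself can be committed without exposing the key.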

Start the proxy

$ litellm --config /path/to/config.yaml --debug

Send a request to the LiteLLM Proxy Server

OpenAI Python v1.0.0+
curl

import openai
client = openai.OpenAI(
    api_key="sk-1234",             # pass litellm proxy key, if you're using virtual keys
    base_url="http://0.0.0.0:4000" # litellm-proxy-base url
)

response = client.chat.completions.create(
    model="dbrx-instruct",
    messages = [
      {
          "role": "system",
          "content": "Be a good human!"
      },
      {
          "role": "user",
          "content": "What do you know about earth?"
      }
  ]
)

print(response)

curl --location 'http://0.0.0.0:4000/chat/completions' \
    --header 'Authorization: Bearer sk-1234' \
    --header 'Content-Type: application/json' \
    --data '{
    "model": "dbrx-instruct",
    "messages": [
      {
          "role": "system",
          "content": "Be a good human!"
      },
      {
          "role": "user",
          "content": "What do you know about earth?"
      }
      ]
}'
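The same request can be assembled in Python with only the standard library. This is a sketch: `build_chat_request` is a hypothetical helper, and the URL and key are the example values from the curl call. Serializing the body with `json.dumps` guarantees valid JSON, which hand-written payloads can easily break (for example with a trailing comma).

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, messages):
    """Build (but do not send) a POST to the proxy's /chat/completions route."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    base_url="http://0.0.0.0:4000",
    api_key="sk-1234",
    model="dbrx-instruct",
    messages=[
        {"role": "system", "content": "Be a good human!"},
        {"role": "user", "content": "What do you know about earth?"},
    ],
)

# Sending requires a running proxy, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```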

Passing additional params - max_tokens, temperature

See the LiteLLM documentation for all parameters supported by litellm.completion.

# !pip install litellm
from litellm import completion
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks api base"

# databricks dbrx call
response = completion(
    model="databricks/databricks-dbrx-instruct", 
    messages = [{ "content": "Hello, how are you?","role": "user"}],
    max_tokens=20,
    temperature=0.5
)

Proxy

  model_list:
    - model_name: llama-3
      litellm_params:
        model: databricks/databricks-meta-llama-3-70b-instruct
        api_key: os.environ/DATABRICKS_API_KEY
        max_tokens: 20
        temperature: 0.5

Usage - Thinking / `reasoning_content`

LiteLLM translates OpenAI's reasoning_effort parameter into Anthropic's thinking parameter.

reasoning_effort	thinking
"low"	"budget_tokens": 1024
"medium"	"budget_tokens": 2048
"high"	"budget_tokens": 4096
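The table can be read as a simple lookup. The sketch below mirrors it with a hypothetical `thinking_for_effort` function that returns a `thinking` value in the `{"type": "enabled", "budget_tokens": ...}` shape used elsewhere on this page. It illustrates the mapping only and is not LiteLLM's internal implementation.

```python
# Budgets copied from the table above.
REASONING_EFFORT_BUDGETS = {"low": 1024, "medium": 2048, "high": 4096}


def thinking_for_effort(reasoning_effort: str) -> dict:
    """Map a reasoning_effort level to a thinking parameter (illustrative)."""
    try:
        budget = REASONING_EFFORT_BUDGETS[reasoning_effort]
    except KeyError:
        allowed = ", ".join(REASONING_EFFORT_BUDGETS)
        raise ValueError(
            f"unknown reasoning_effort {reasoning_effort!r}; expected one of: {allowed}"
        ) from None
    return {"type": "enabled", "budget_tokens": budget}


print(thinking_for_effort("medium"))
```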

Known Limitations

Passing thinking blocks back to Claude is not yet supported (tracked in an issue).

SDK
PROXY

from litellm import completion
import os

# set ENV variables (can also be passed in to .completion() - e.g. `api_base`, `api_key`)
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url"

resp = completion(
    model="databricks/databricks-claude-3-7-sonnet",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)

Set up config.yaml

model_list:
  - model_name: claude-3-7-sonnet
    litellm_params:
      model: databricks/databricks-claude-3-7-sonnet
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE

Start the proxy

litellm --config /path/to/config.yaml

Test it!

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
  -d '{
    "model": "claude-3-7-sonnet",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "reasoning_effort": "low"
  }'

Expected Response

ModelResponse(
    id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
    created=1740470510,
    model='claude-3-7-sonnet-20250219',
    object='chat.completion',
    system_fingerprint=None,
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content="The capital of France is Paris.",
                role='assistant',
                tool_calls=None,
                function_call=None,
                provider_specific_fields={
                    'citations': None,
                    'thinking_blocks': [
                        {
                            'type': 'thinking',
                            'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                            'signature': 'EuYBCkQYAiJAy6...'
                        }
                    ]
                }
            ),
            thinking_blocks=[
                {
                    'type': 'thinking',
                    'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                    'signature': 'EuYBCkQYAiJAy6AGB...'
                }
            ],
            reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
        )
    ],
    usage=Usage(
        completion_tokens=68,
        prompt_tokens=42,
        total_tokens=110,
        completion_tokens_details=None,
        prompt_tokens_details=PromptTokensDetailsWrapper(
            audio_tokens=None,
            cached_tokens=0,
            text_tokens=None,
            image_tokens=None
        ),
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
)
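To pull the answer and the reasoning out of a response like the one above, something along these lines may help. The function `extract_reasoning` is hypothetical. Because the printed sample shows `reasoning_content` and `thinking_blocks` next to `message`, the sketch checks the message first and then the choice, so it works with either layout. The `SimpleNamespace` object is a stand-in that mirrors only the fields used here, not a real `ModelResponse`.

```python
from types import SimpleNamespace


def extract_reasoning(response):
    """Collect the answer, reasoning text, and thinking blocks from a response."""
    choice = response.choices[0]
    message = choice.message

    def lookup(field):
        # Prefer the message; fall back to the enclosing choice.
        value = getattr(message, field, None)
        if value is None:
            value = getattr(choice, field, None)
        return value

    return {
        "answer": message.content,
        "reasoning": lookup("reasoning_content"),
        "thinking_blocks": lookup("thinking_blocks") or [],
    }


# Stand-in response for illustration only.
sample = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="The capital of France is Paris."),
            reasoning_content="This is a very straightforward factual question.",
            thinking_blocks=[{"type": "thinking", "thinking": "..."}],
        )
    ]
)

result = extract_reasoning(sample)
print(result["answer"])
print(result["reasoning"])
```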

Pass `thinking` to Anthropic models

You can also pass the thinking parameter directly to Anthropic models.

SDK
PROXY

from litellm import completion
import os

# set ENV variables (can also be passed in to .completion() - e.g. `api_base`, `api_key`)
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url"

response = completion(
  model="databricks/databricks-claude-3-7-sonnet",
  messages=[{"role": "user", "content": "What is the capital of France?"}],
  thinking={"type": "enabled", "budget_tokens": 1024},
)

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "claude-3-7-sonnet",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "thinking": {"type": "enabled", "budget_tokens": 1024}
  }'

Supported Databricks Chat Completion Models

Tip

We support ALL Databricks models. Use the databricks/ prefix, i.e. model=databricks/<any-model-on-databricks>, when sending LiteLLM requests.

Model Name	Command
databricks-claude-3-7-sonnet	`completion(model='databricks/databricks-claude-3-7-sonnet', messages=messages)`
databricks-meta-llama-3-1-70b-instruct	`completion(model='databricks/databricks-meta-llama-3-1-70b-instruct', messages=messages)`
databricks-meta-llama-3-1-405b-instruct	`completion(model='databricks/databricks-meta-llama-3-1-405b-instruct', messages=messages)`
databricks-dbrx-instruct	`completion(model='databricks/databricks-dbrx-instruct', messages=messages)`
databricks-meta-llama-3-70b-instruct	`completion(model='databricks/databricks-meta-llama-3-70b-instruct', messages=messages)`
databricks-llama-2-70b-chat	`completion(model='databricks/databricks-llama-2-70b-chat', messages=messages)`
databricks-mixtral-8x7b-instruct	`completion(model='databricks/databricks-mixtral-8x7b-instruct', messages=messages)`
databricks-mpt-30b-instruct	`completion(model='databricks/databricks-mpt-30b-instruct', messages=messages)`
databricks-mpt-7b-instruct	`completion(model='databricks/databricks-mpt-7b-instruct', messages=messages)`

Embedding Models

Passing Databricks-specific params - 'instruction'

For embedding models, Databricks lets you pass an additional parameter, 'instruction'. See the Databricks documentation for the full specification.

# !pip install litellm
from litellm import embedding
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks url"

# Databricks bge-large-en call
response = embedding(
    model="databricks/databricks-bge-large-en",
    input=["good morning from litellm"],
    instruction="Represent this sentence for searching relevant passages:",
)

Proxy

  model_list:
    - model_name: bge-large
      litellm_params:
        model: databricks/databricks-bge-large-en
        api_key: os.environ/DATABRICKS_API_KEY
        api_base: os.environ/DATABRICKS_API_BASE
        instruction: "Represent this sentence for searching relevant passages:"

Supported Databricks Embedding Models

Tip

We support ALL Databricks models. Use the databricks/ prefix, i.e. model=databricks/<any-model-on-databricks>, when sending LiteLLM requests.

Model Name	Command
databricks-bge-large-en	`embedding(model='databricks/databricks-bge-large-en', input=input)`
databricks-gte-large-en	`embedding(model='databricks/databricks-gte-large-en', input=input)`
