Structured Outputs (JSON Mode)
Quick Start
- SDK
- PROXY
from litellm import completion
import os
os.environ["OPENAI_API_KEY"] = ""
response = completion(
model="gpt-4o-mini",
response_format={ "type": "json_object" },
messages=[
{"role": "system", "content": "You are a helpful assistant designed to output JSON."},
{"role": "user", "content": "Who won the world series in 2020?"}
]
)
print(response.choices[0].message.content)
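Because json_object mode constrains the model to emit syntactically valid JSON, the content string can be parsed directly. A minimal sketch (note the exact keys are chosen by the model unless you pin them down with a schema):

import json

# response.choices[0].message.content is a JSON string in json_object mode
data = json.loads(response.choices[0].message.content)
print(data)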
curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $LITELLM_KEY" \
-d '{
"model": "gpt-4o-mini",
"response_format": { "type": "json_object" },
"messages": [
{
"role": "system",
"content": "You are a helpful assistant designed to output JSON."
},
{
"role": "user",
"content": "Who won the world series in 2020?"
}
]
}'
Check Model Support
1. Check if the model supports response_format
Call litellm.get_supported_openai_params to check if a model/provider supports response_format.
from litellm import get_supported_openai_params
params = get_supported_openai_params(model="anthropic.claude-3", custom_llm_provider="bedrock")
assert "response_format" in params
2. Check if the model supports json_schema
This is used to check if you can pass:
- response_format={ "type": "json_schema", "json_schema": … , "strict": true }
- response_format=<Pydantic Model>
from litellm import supports_response_schema
assert supports_response_schema(model="gemini-1.5-pro-preview-0215", custom_llm_provider="bedrock")
Check out model_prices_and_context_window.json for a full list of models and their support for response_schema.
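The two checks compose into a small helper that picks the strongest mode a model supports. This helper is written for illustration and is not part of LiteLLM; it mirrors the response_format shapes shown above:

from litellm import get_supported_openai_params, supports_response_schema

def pick_response_format(model: str, provider: str, json_schema: dict):
    """Return the strongest response_format the model supports, or None."""
    if supports_response_schema(model=model, custom_llm_provider=provider):
        return {"type": "json_schema", "json_schema": json_schema, "strict": True}
    params = get_supported_openai_params(model=model, custom_llm_provider=provider) or []
    if "response_format" in params:
        return {"type": "json_object"}
    return None  # fall back to asking for JSON in the prompt itself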
Pass in 'json_schema'
To use structured outputs, simply specify
response_format: { "type": "json_schema", "json_schema": … , "strict": true }
Works for:
- OpenAI models
- Azure OpenAI models
- xAI models (Grok-2 or later)
- Google AI Studio - Gemini models
- Vertex AI models (Gemini + Anthropic)
- Bedrock models
- Anthropic API models
- Groq models
- Ollama models
- Databricks models
- SDK
- PROXY
import os
from litellm import completion
from pydantic import BaseModel
# add to env var
os.environ["OPENAI_API_KEY"] = ""
messages = [{"role": "user", "content": "List 5 important events in the XIX century"}]
class CalendarEvent(BaseModel):
name: str
date: str
participants: list[str]
class EventsList(BaseModel):
events: list[CalendarEvent]
resp = completion(
model="gpt-4o-2024-08-06",
messages=messages,
response_format=EventsList
)
print("Received={}".format(resp))
- Add the OpenAI model to your config.yaml
model_list:
- model_name: "gpt-4o"
litellm_params:
model: "gpt-4o-2024-08-06"
- Start the proxy with the config.yaml
litellm --config /path/to/config.yaml
- Call with OpenAI SDK / curl!
Just replace the 'base_url' in the OpenAI SDK to call the proxy with 'json_schema' for OpenAI models.
OpenAI SDK
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI(
api_key="anything", # 👈 PROXY KEY (can be anything, if master_key not set)
base_url="http://0.0.0.0:4000" # 👈 PROXY BASE URL
)
class Step(BaseModel):
explanation: str
output: str
class MathReasoning(BaseModel):
steps: list[Step]
final_answer: str
completion = client.beta.chat.completions.parse(
model="gpt-4o",
messages=[
{"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
{"role": "user", "content": "how can I solve 8x + 7 = -23"}
],
response_format=MathReasoning,
)
math_reasoning = completion.choices[0].message.parsed
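The .parsed attribute gives back a MathReasoning instance (or None if the model refused), so the result can be used as typed objects:

if math_reasoning is not None:  # .parsed is None when the model refuses
    for step in math_reasoning.steps:
        print(f"{step.explanation} -> {step.output}")
    print("Final answer:", math_reasoning.final_answer)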
Curl
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor. Guide the user through the solution step by step."
},
{
"role": "user",
"content": "how can I solve 8x + 7 = -23"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "math_reasoning",
"schema": {
"type": "object",
"properties": {
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"explanation": { "type": "string" },
"output": { "type": "string" }
},
"required": ["explanation", "output"],
"additionalProperties": false
}
},
"final_answer": { "type": "string" }
},
"required": ["steps", "final_answer"],
"additionalProperties": false
},
"strict": true
}
}
}'
Validate JSON Schema
Not all Vertex models support passing the json_schema to them (e.g., gemini-1.5-flash). To solve this, LiteLLM supports client-side validation of the json_schema.
litellm.enable_json_schema_validation=True
If litellm.enable_json_schema_validation=True is set, LiteLLM will validate the JSON response using jsonvalidator.
- SDK
- PROXY
# !gcloud auth application-default login - run this to add vertex credentials to your env
import litellm, os
from litellm import completion
from pydantic import BaseModel
messages=[
{"role": "system", "content": "Extract the event information."},
{"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
]
litellm.enable_json_schema_validation = True
litellm.set_verbose = True # see the raw request made by litellm
class CalendarEvent(BaseModel):
name: str
date: str
participants: list[str]
resp = completion(
model="gemini/gemini-1.5-pro",
messages=messages,
response_format=CalendarEvent,
)
print("Received={}".format(resp))
- Create a config.yaml
model_list:
- model_name: "gemini-1.5-flash"
litellm_params:
model: "gemini/gemini-1.5-flash"
api_key: os.environ/GEMINI_API_KEY
litellm_settings:
enable_json_schema_validation: True
- Start the proxy
litellm --config /path/to/config.yaml
- Test it!
curl http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-d '{
"model": "gemini-1.5-flash",
"messages": [
{"role": "system", "content": "Extract the event information."},
{"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
],
"response_format": {
"type": "json_object",
"response_schema": {
"type": "json_schema",
"json_schema": {
"name": "math_reasoning",
"schema": {
"type": "object",
"properties": {
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"explanation": { "type": "string" },
"output": { "type": "string" }
},
"required": ["explanation", "output"],
"additionalProperties": false
}
},
"final_answer": { "type": "string" }
},
"required": ["steps", "final_answer"],
"additionalProperties": false
},
"strict": true
},
}
},
}'