Batching Completion()

LiteLLM allows you to:

  • Send many completion calls to 1 model
  • Send 1 completion call to many models: return the fastest response
  • Send 1 completion call to many models: return all responses
Info

Trying to do batch completion with the LiteLLM Proxy? Go here: https://docs.litellm.de/docs/proxy/user_keys#beta-batch-completions---pass-model-as-list

Send multiple completion calls to 1 model

In the `batch_completion` method, you provide a list of message lists. Each sub-list of messages is passed to litellm.completion(), so you can process multiple prompts with a single call to `batch_completion`.

Example Code

import litellm
import os
from litellm import batch_completion

# Set your Anthropic API key before running
os.environ['ANTHROPIC_API_KEY'] = ""

# Each inner list is one conversation, sent as its own completion call
responses = batch_completion(
    model="claude-2",
    messages=[
        [
            {
                "role": "user",
                "content": "good morning? "
            }
        ],
        [
            {
                "role": "user",
                "content": "what's the time? "
            }
        ]
    ]
)
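As a rough illustration of the fan-out described above, the sketch below sends one call per message list from a thread pool and collects the results in input order. It is not LiteLLM's implementation: `fake_completion` and `fan_out` are hypothetical names, and `fake_completion` only echoes the prompt instead of calling a provider.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_completion(model, messages):
    # Hypothetical stand-in for litellm.completion(); makes no network call.
    last_user_message = messages[-1]["content"]
    return {
        "model": model,
        "choices": [
            {"message": {"role": "assistant", "content": f"echo: {last_user_message}"}}
        ],
    }


def fan_out(model, batches, max_workers=4):
    # One task per message list; map() yields results in the order of `batches`.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda msgs: fake_completion(model, msgs), batches))


batches = [
    [{"role": "user", "content": "good morning? "}],
    [{"role": "user", "content": "what's the time? "}],
]
responses = fan_out("claude-2", batches)
for response in responses:
    print(response["choices"][0]["message"]["content"])
```

Because results come back in input order, the response at index i belongs to the message list at index i.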

Send 1 completion call to many models: return the fastest response

This makes parallel calls to the specified models and returns the first response.

Use this to reduce latency.

Example Code

import litellm
import os
from litellm import batch_completion_models

# Set the API key for each provider you call
os.environ['ANTHROPIC_API_KEY'] = ""
os.environ['OPENAI_API_KEY'] = ""
os.environ['COHERE_API_KEY'] = ""

# The same messages are sent to every model; the first response is returned
response = batch_completion_models(
    models=["gpt-3.5-turbo", "claude-instant-1.2", "command-nightly"],
    messages=[{"role": "user", "content": "Hey, how's it going"}]
)
print(response)

Output

Returns the first response in OpenAI format. Cancels other LLM API calls.

{
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " I'm doing well, thanks for asking! I'm an AI assistant created by Anthropic to be helpful, harmless, and honest.",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-23273eed-e351-41be-a492-bafcf5cf3274",
  "created": 1695154628.2076092,
  "model": "command-nightly",
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 14,
    "total_tokens": 20
  }
}
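To make the "first response wins" behavior concrete, here is a minimal sketch of the general pattern using Python's standard `concurrent.futures`. It is an assumption about how such a race can be written, not LiteLLM's actual code: `fake_completion`, `first_response`, the model names, and the simulated latencies are all hypothetical.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Hypothetical per-model latencies in seconds, used only to simulate a race.
SIMULATED_LATENCY = {"slow-model": 0.5, "fast-model": 0.05}


def fake_completion(model, messages):
    # Hypothetical stand-in for a provider call; sleeps instead of using the network.
    time.sleep(SIMULATED_LATENCY[model])
    return {
        "model": model,
        "choices": [{"message": {"role": "assistant", "content": "hi"}}],
    }


def first_response(models, messages):
    pool = ThreadPoolExecutor(max_workers=len(models))
    futures = [pool.submit(fake_completion, m, messages) for m in models]
    # Block until at least one call has finished.
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    for future in pending:
        # cancel() only stops tasks that have not started running yet.
        future.cancel()
    pool.shutdown(wait=False)  # do not block on the slower calls
    return next(iter(done)).result()


winner = first_response(
    ["slow-model", "fast-model"],
    [{"role": "user", "content": "Hey, how's it going"}],
)
print(winner["model"])
```

In this sketch a thread that is already running cannot be interrupted, so the slower call still runs to completion in the background; only its result is discarded.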

Send 1 completion call to many models: return all responses

This makes parallel calls to the specified models and returns all responses.

Use this to process requests in parallel and get responses from multiple models.

Example Code

import litellm
import os
from litellm import batch_completion_models_all_responses

os.environ['ANTHROPIC_API_KEY'] = ""
os.environ['OPENAI_API_KEY'] = ""
os.environ['COHERE_API_KEY'] = ""

responses = batch_completion_models_all_responses(
    models=["gpt-3.5-turbo", "claude-instant-1.2", "command-nightly"],
    messages=[{"role": "user", "content": "Hey, how's it going"}]
)
print(responses)

Output

[<ModelResponse chat.completion id=chatcmpl-e673ec8e-4e8f-4c9e-bf26-bf9fa7ee52b9 at 0x103a62160> JSON: {
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop_sequence",
      "index": 0,
      "message": {
        "content": " It's going well, thank you for asking! How about you?",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-e673ec8e-4e8f-4c9e-bf26-bf9fa7ee52b9",
  "created": 1695222060.917964,
  "model": "claude-instant-1.2",
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 9,
    "total_tokens": 23
  }
}, <ModelResponse chat.completion id=chatcmpl-ab6c5bd3-b5d9-4711-9697-e28d9fb8a53c at 0x103a62b60> JSON: {
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " It's going well, thank you for asking! How about you?",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-ab6c5bd3-b5d9-4711-9697-e28d9fb8a53c",
  "created": 1695222061.0445492,
  "model": "command-nightly",
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 14,
    "total_tokens": 20
  }
}, <OpenAIObject chat.completion id=chatcmpl-80szFnKHzCxObW0RqCMw1hWW1Icrq at 0x102dd6430> JSON: {
  "id": "chatcmpl-80szFnKHzCxObW0RqCMw1hWW1Icrq",
  "object": "chat.completion",
  "created": 1695222061,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm an AI language model, so I don't have feelings, but I'm here to assist you with any questions or tasks you might have. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 39,
    "total_tokens": 52
  }
}]
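Once all responses are back, a common next step is to compare them side by side. The sketch below assumes each response can be read like the dictionaries printed above; it uses plain dictionaries trimmed from that output rather than real response objects, and `summarize` is a hypothetical helper name.

```python
# Plain dictionaries trimmed from the output above; real responses carry more fields.
responses = [
    {
        "model": "claude-instant-1.2",
        "choices": [{"message": {"role": "assistant",
                                 "content": " It's going well, thank you for asking! How about you?"}}],
        "usage": {"total_tokens": 23},
    },
    {
        "model": "command-nightly",
        "choices": [{"message": {"role": "assistant",
                                 "content": " It's going well, thank you for asking! How about you?"}}],
        "usage": {"total_tokens": 20},
    },
    {
        "model": "gpt-3.5-turbo-0613",
        "choices": [{"message": {"role": "assistant",
                                 "content": "Hello! I'm an AI language model, so I don't have feelings, but I'm here to assist you with any questions or tasks you might have. How can I help you today?"}}],
        "usage": {"total_tokens": 52},
    },
]


def summarize(responses):
    # Map each model name to the text of its first choice, without surrounding whitespace.
    return {r["model"]: r["choices"][0]["message"]["content"].strip() for r in responses}


answers = summarize(responses)
total_tokens = sum(r["usage"]["total_tokens"] for r in responses)

for model, text in answers.items():
    print(f"{model}: {text}")
print("total tokens:", total_tokens)
```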