Custom Guardrail

Use this if you want to write code to run a custom guardrail.

Quick Start

1. Write a `CustomGuardrail` class

A CustomGuardrail has 4 methods to enforce guardrails:

async_pre_call_hook - (Optional) modify the input or reject the request before the LLM API call is made
async_moderation_hook - (Optional) reject the request; runs while the LLM API call is in flight (helps reduce latency)
async_post_call_success_hook - (Optional) apply the guardrail to the input/output; runs after the LLM API call
async_post_call_streaming_iterator_hook - (Optional) pass the entire stream to the guardrail

See the CustomGuardrail Methods table at the end of this page for the detailed specification of the methods.
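To make the timing of the hooks concrete, the following is a minimal simulation of the order described above. It does not use LiteLLM: `SketchGuardrail`, `fake_llm_call`, and `handle_request` are hypothetical names invented for this sketch, and the real proxy's dispatch logic may differ.

```python
import asyncio
from typing import List


class SketchGuardrail:
    """Hypothetical stand-in for a CustomGuardrail that records hook order."""

    def __init__(self) -> None:
        self.events: List[str] = []

    async def async_pre_call_hook(self, data: dict) -> dict:
        self.events.append("pre_call")
        return data  # a real hook may return modified data or raise

    async def async_moderation_hook(self, data: dict) -> None:
        self.events.append("during_call")  # a real hook may raise to reject

    async def async_post_call_success_hook(self, data: dict, response: str) -> None:
        self.events.append("post_call")  # a real hook may raise to reject


async def fake_llm_call(data: dict) -> str:
    """Placeholder for the LLM API call (assumption: returns plain text)."""
    await asyncio.sleep(0)
    return "hello"


async def handle_request(guardrail: SketchGuardrail, data: dict) -> str:
    # 1. The pre-call hook runs first and may change the input.
    data = await guardrail.async_pre_call_hook(data)
    # 2. The moderation hook runs concurrently with the LLM call.
    response, _ = await asyncio.gather(
        fake_llm_call(data), guardrail.async_moderation_hook(data)
    )
    # 3. The post-call hook sees both the input and the response.
    await guardrail.async_post_call_success_hook(data, response)
    return response


if __name__ == "__main__":
    guard = SketchGuardrail()
    print(asyncio.run(handle_request(guard, {"messages": []})), guard.events)
```

Because the moderation hook shares the event loop with the LLM call, a rejection there costs no extra latency, but it cannot change what was already sent to the model.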

Example CustomGuardrail class

Create a new file called custom_guardrail.py and add this code to it:

from typing import Any, AsyncGenerator, Literal, Optional, Union

import litellm
from litellm._logging import verbose_proxy_logger
from litellm.caching.caching import DualCache
from litellm.integrations.custom_guardrail import CustomGuardrail
from litellm.proxy._types import UserAPIKeyAuth
from litellm.types.utils import ModelResponseStream


class myCustomGuardrail(CustomGuardrail):
    def __init__(
        self,
        **kwargs,
    ):
        # store kwargs as optional_params
        self.optional_params = kwargs

        super().__init__(**kwargs)

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: Literal[
            "completion",
            "text_completion",
            "embeddings",
            "image_generation",
            "moderation",
            "audio_transcription",
            "pass_through_endpoint",
            "rerank"
        ],
    ) -> Optional[Union[Exception, str, dict]]:
        """
        Runs before the LLM API call
        Runs on only Input
        Use this if you want to MODIFY the input
        """

        # In this guardrail, if a user inputs `litellm` we mask it and then send it to the LLM.
        # Note: the check below is case-insensitive, but str.replace masks only the lowercase spelling.
        _messages = data.get("messages")
        if _messages:
            for message in _messages:
                _content = message.get("content")
                if isinstance(_content, str):
                    if "litellm" in _content.lower():
                        _content = _content.replace("litellm", "********")
                        message["content"] = _content

        verbose_proxy_logger.debug(
            "async_pre_call_hook: Message after masking %s", _messages
        )

        return data

    async def async_moderation_hook(
        self,
        data: dict,
        user_api_key_dict: UserAPIKeyAuth,
        call_type: Literal["completion", "embeddings", "image_generation", "moderation", "audio_transcription"],
    ):
        """
        Runs in parallel to LLM API call
        Runs on only Input

        This can NOT modify the input, only used to reject or accept a call before going to LLM API
        """

        # This inspects the same input as async_pre_call_hook, but runs in parallel with the LLM API call.
        # In this guardrail, if a user inputs `litellm` we reject the request.
        _messages = data.get("messages")
        if _messages:
            for message in _messages:
                _content = message.get("content")
                if isinstance(_content, str):
                    if "litellm" in _content.lower():
                        raise ValueError("Guardrail failed words - `litellm` detected")

    async def async_post_call_success_hook(
        self,
        data: dict,
        user_api_key_dict: UserAPIKeyAuth,
        response,
    ):
        """
        Runs on response from LLM API call

        It can be used to reject a response

        If a response contains the word "coffee" -> we will raise an exception
        """
        verbose_proxy_logger.debug("async_post_call_success_hook response: %s", response)
        if isinstance(response, litellm.ModelResponse):
            for choice in response.choices:
                if isinstance(choice, litellm.Choices):
                    verbose_proxy_logger.debug("async_post_call_success_hook choice: %s", choice)
                    if (
                        choice.message.content
                        and isinstance(choice.message.content, str)
                        and "coffee" in choice.message.content
                    ):
                        raise ValueError("Guardrail failed Coffee Detected")

    async def async_post_call_streaming_iterator_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        response: Any,
        request_data: dict,
    ) -> AsyncGenerator[ModelResponseStream, None]:
        """
        Passes the entire stream to the guardrail

        This is useful for guardrails that need to see the entire response, such as PII masking.

        See Aim guardrail implementation for an example - https://github.com/BerriAI/litellm/blob/d0e022cfacb8e9ebc5409bb652059b6fd97b45c0/litellm/proxy/guardrails/guardrail_hooks/aim.py#L168

        Triggered by mode: 'post_call'
        """
        async for item in response:
            yield item
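The streaming hook above simply passes chunks through. As a rough illustration of why a guardrail might want the whole stream, here is a sketch that buffers every chunk, checks the joined text, and only then replays the chunks. It uses plain strings instead of `ModelResponseStream` objects, and `buffered_stream_guard`, `fake_stream`, and `collect` are hypothetical helpers, not part of LiteLLM.

```python
import asyncio
from typing import AsyncGenerator, AsyncIterator, List


async def buffered_stream_guard(
    stream: AsyncIterator[str], banned_word: str
) -> AsyncGenerator[str, None]:
    """Collect all chunks, check the joined text, then replay the chunks."""
    chunks: List[str] = []
    async for chunk in stream:
        chunks.append(chunk)

    # A banned word can be split across chunks, so check the joined text.
    full_text = "".join(chunks)
    if banned_word in full_text.lower():
        raise ValueError(f"Guardrail failed: `{banned_word}` detected in stream")

    for chunk in chunks:
        yield chunk


async def fake_stream(parts: List[str]) -> AsyncGenerator[str, None]:
    """Stand-in for a streamed LLM response (assumption: text chunks)."""
    for part in parts:
        yield part


async def collect(stream: AsyncIterator[str]) -> List[str]:
    """Drain an async stream into a list."""
    return [chunk async for chunk in stream]


if __name__ == "__main__":
    print(asyncio.run(collect(buffered_stream_guard(fake_stream(["te", "a"]), "coffee"))))
```

The trade-off of this design is latency: the client receives nothing until the full response has been generated and checked.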

2. Pass your custom guardrail class in the LiteLLM `config.yaml`

In the config below, we point the guardrail to our custom guardrail by setting guardrail: custom_guardrail.myCustomGuardrail

Python filename: custom_guardrail.py
Guardrail class name: myCustomGuardrail. This is defined in Step 1

guardrail: custom_guardrail.myCustomGuardrail

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: "custom-pre-guard"
    litellm_params:
      guardrail: custom_guardrail.myCustomGuardrail  # 👈 Key change
      mode: "pre_call"                  # runs async_pre_call_hook
  - guardrail_name: "custom-during-guard"
    litellm_params:
      guardrail: custom_guardrail.myCustomGuardrail  
      mode: "during_call"               # runs async_moderation_hook
  - guardrail_name: "custom-post-guard"
    litellm_params:
      guardrail: custom_guardrail.myCustomGuardrail
      mode: "post_call"                 # runs async_post_call_success_hook
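The `guardrail:` value follows a `<python file name>.<class name>` pattern. The sketch below shows one plausible way such a dotted path can be resolved to a class with `importlib`. It is an illustration only: `load_class`, `demo`, and the module name `sketch_guardrail_module` are hypothetical, and LiteLLM's own loader may work differently.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_class(dotted_path: str, search_dir: str) -> type:
    """Resolve 'module_name.ClassName' to a class found in search_dir."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    if search_dir not in sys.path:
        sys.path.insert(0, search_dir)  # make the directory importable
    importlib.invalidate_caches()  # pick up files created after startup
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def demo() -> str:
    """Write a tiny module to a temp directory and load a class from it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        Path(tmp_dir, "sketch_guardrail_module.py").write_text(
            "class myCustomGuardrail:\n    pass\n"
        )
        cls = load_class("sketch_guardrail_module.myCustomGuardrail", tmp_dir)
        return cls.__name__


if __name__ == "__main__":
    print(demo())
```

Under this assumption, the file must be importable from wherever the gateway runs, which is why Step 3 mounts custom_guardrail.py next to the config.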

3. Start LiteLLM Gateway

Docker Run

Mount your custom_guardrail.py on the LiteLLM Docker container.

This mounts your custom_guardrail.py file from your local directory to the /app directory in the Docker container, making it accessible to the LiteLLM Gateway.

docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  --name my-app \
  -v $(pwd)/my_config.yaml:/app/config.yaml \
  -v $(pwd)/custom_guardrail.py:/app/custom_guardrail.py \
  my-app:latest \
  --config /app/config.yaml \
  --port 4000 \
  --detailed_debug

litellm pip

litellm --config config.yaml --detailed_debug

4. Test it

Test `"custom-pre-guard"`

Langchain, OpenAI SDK usage examples

Modify input

Expect the word litellm to be masked before the request is sent to the LLM API. This runs async_pre_call_hook.

curl -i -X POST http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "say the word - `litellm`"
        }
    ],
   "guardrails": ["custom-pre-guard"]
}'

Expected response after pre-guard

{
  "id": "chatcmpl-9zREDkBIG20RJB4pMlyutmi1hXQWc",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "It looks like you've chosen a string of asterisks. This could be a way to censor or hide certain text. However, without more context, I can't provide a specific word or phrase. If there's something specific you'd like me to say or if you need help with a topic, feel free to let me know!",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "created": 1724429701,
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_3aa7262c27",
  "usage": {
    "completion_tokens": 65,
    "prompt_tokens": 14,
    "total_tokens": 79
  },
  "service_tier": null
}
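Note that the pre-call hook from Step 1 detects the keyword case-insensitively (`_content.lower()`) but masks it with `str.replace`, which only matches the exact lowercase spelling, so an input such as `LiteLLM` would reach the model unmasked. A case-insensitive variant could look like the sketch below; `mask_keyword` is a hypothetical helper, not a LiteLLM function.

```python
import re
from typing import Any, Dict, List


def mask_keyword(
    messages: List[Dict[str, Any]], keyword: str, mask: str = "********"
) -> List[Dict[str, Any]]:
    """Mask every occurrence of keyword in string contents, ignoring case."""
    pattern = re.compile(re.escape(keyword), flags=re.IGNORECASE)
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):  # skip non-string (e.g. multimodal) content
            message["content"] = pattern.sub(mask, content)
    return messages


if __name__ == "__main__":
    print(mask_keyword([{"role": "user", "content": "say LiteLLM"}], "litellm"))
```

Inside `async_pre_call_hook`, the same call could be applied to `data["messages"]` before returning `data`.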

Successful call

curl -i http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "hi what is the weather"}
    ],
    "guardrails": ["custom-pre-guard"]
  }'
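For readers who prefer Python over curl, the same request can be assembled with the standard library. This is a sketch under assumptions: the base URL `http://0.0.0.0:4000` and the key `sk-1234` are the example values used on this page, and `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import List


def build_chat_request(
    base_url: str, api_key: str, model: str, content: str, guardrails: List[str]
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with guardrails."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "guardrails": guardrails,  # names defined under `guardrails:` in config.yaml
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_chat_request(
        "http://0.0.0.0:4000", "sk-1234", "gpt-4",
        "say the word - `litellm`", ["custom-pre-guard"],
    )
    print(request.full_url, request.data.decode("utf-8"))
```

With the gateway running, `urllib.request.urlopen(request)` would send it; a rejected call surfaces as `urllib.error.HTTPError`.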

Test `"custom-during-guard"`

Langchain, OpenAI SDK usage examples

Unsuccessful call

Expect this to fail since litellm is in the message content. This runs async_moderation_hook.

curl -i -X POST http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "say the word - `litellm`"
        }
    ],
   "guardrails": ["custom-during-guard"]
}'

Expected response after running during-guard

{
  "error": {
    "message": "Guardrail failed words - `litellm` detected",
    "type": "None",
    "param": "None",
    "code": "500"
  }
}
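A client can tell a guardrail rejection apart from other failures by reading the `error.message` field shown above. The helper below is a small sketch of that parsing; `guardrail_error_message` is a hypothetical name, and it assumes the error body keeps the shape shown in this example.

```python
import json
from typing import Optional


def guardrail_error_message(body: str) -> Optional[str]:
    """Return error.message from a proxy error body, or None if absent."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return None  # not JSON, so not the error shape shown above
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if isinstance(error, dict):
        message = error.get("message")
        return message if isinstance(message, str) else None
    return None


if __name__ == "__main__":
    sample = '{"error": {"message": "Guardrail failed words - `litellm` detected", "code": "500"}}'
    print(guardrail_error_message(sample))
```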

Successful call

curl -i http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "hi what is the weather"}
    ],
    "guardrails": ["custom-during-guard"]
  }'

Test `"custom-post-guard"`

Langchain, OpenAI SDK usage examples

Unsuccessful call

Expect this to fail since coffee will be in the response content. This runs async_post_call_success_hook.

curl -i -X POST http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "what is coffee"
        }
    ],
   "guardrails": ["custom-post-guard"]
}'

Expected response after running post-guard

{
  "error": {
    "message": "Guardrail failed Coffee Detected",
    "type": "None",
    "param": "None",
    "code": "500"
  }
}

Successful call

curl -i -X POST http://0.0.0.0:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "what is tea"
        }
    ],
   "guardrails": ["custom-post-guard"]
}'

✨ Pass additional parameters to guardrail

Info

✨ This is an Enterprise-only feature. Contact us to get a free trial.

Use this to pass additional parameters to the guardrail API call, e.g. success thresholds.

Use get_guardrail_dynamic_request_body_params

get_guardrail_dynamic_request_body_params is a method of the litellm.integrations.custom_guardrail.CustomGuardrail class that fetches the dynamic guardrail params passed in the request body.

from typing import Any, Dict, List, Literal, Optional, Union
import litellm
from litellm._logging import verbose_proxy_logger
from litellm.caching.caching import DualCache
from litellm.integrations.custom_guardrail import CustomGuardrail
from litellm.proxy._types import UserAPIKeyAuth

class myCustomGuardrail(CustomGuardrail):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    async def async_pre_call_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: Literal[
            "completion",
            "text_completion",
            "embeddings",
            "image_generation",
            "moderation",
            "audio_transcription",
            "pass_through_endpoint",
            "rerank"
        ],
    ) -> Optional[Union[Exception, str, dict]]:
        # Get dynamic params from request body
        params = self.get_guardrail_dynamic_request_body_params(request_data=data)
        # params will contain: {"success_threshold": 0.9}
        verbose_proxy_logger.debug("Guardrail params: %s", params)
        return data

Pass parameters in your API requests

LiteLLM Proxy lets you pass guardrails in the request body, following the guardrails spec.

OpenAI Python
import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem"}],
    extra_body={
        "guardrails": [
            {
                "custom-pre-guard": {
                    "extra_body": {
                        "success_threshold": 0.9
                    }
                }
            }
        ]
    }
)

Curl

curl 'http://0.0.0.0:4000/chat/completions' \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "Write a short poem"
        }
    ],
    "guardrails": [
        {
            "custom-pre-guard": {
                "extra_body": {
                    "success_threshold": 0.9
                }
            }
        }
    ]
}'

The get_guardrail_dynamic_request_body_params method returns:

{
    "success_threshold": 0.9
}
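To show how the request body maps to this return value, here is a simplified illustration of the lookup. It is not the library's implementation: `extract_dynamic_params` is a hypothetical function, and it assumes each parameterized entry in `guardrails` is an object keyed by the guardrail name, while plain string entries carry no parameters.

```python
from typing import Any, Dict


def extract_dynamic_params(
    request_data: Dict[str, Any], guardrail_name: str
) -> Dict[str, Any]:
    """Return the extra_body passed for guardrail_name, or {} if none."""
    requested = request_data.get("guardrails") or []
    for entry in requested:
        # String entries just enable a guardrail; dict entries carry params.
        if isinstance(entry, dict) and guardrail_name in entry:
            config = entry[guardrail_name] or {}
            return dict(config.get("extra_body") or {})
    return {}


if __name__ == "__main__":
    body = {
        "model": "gpt-3.5-turbo",
        "guardrails": [
            {"custom-pre-guard": {"extra_body": {"success_threshold": 0.9}}}
        ],
    }
    print(extract_dynamic_params(body, "custom-pre-guard"))
```

Returning an empty dict when nothing matches lets a guardrail fall back to its configured defaults without special-casing missing parameters.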

CustomGuardrail Methods

Component	Description	Optional	Checked Data	Can Modify Input	Can Modify Output	Can Fail Call
`async_pre_call_hook`	A hook that runs before the LLM API call	✅	INPUT	✅	❌	✅
`async_moderation_hook`	A hook that runs during the LLM API call	✅	INPUT	❌	❌	✅
`async_post_call_success_hook`	A hook that runs after a successful LLM API call	✅	INPUT, OUTPUT	❌	✅	✅
