Vertex AI SDK

Pass-through endpoints for Vertex AI: call provider-specific endpoints in their native format (no translation).

| Feature | Supported | Notes |
|---|---|---|
| Cost tracking | ✅ | Supports all models on the `/generateContent` endpoint |
| Logging | ✅ | Works across all integrations |
| End-user tracking | ❌ | Let us know if you need this |
| Streaming | ✅ | |

Supported endpoints

LiteLLM supports 2 Vertex AI pass-through routes:

/vertex_ai → forwards to https://{vertex_location}-aiplatform.googleapis.com/
/vertex_ai/discovery → forwards to https://discoveryengine.googleapis.com
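
The two routes can be read as a prefix mapping. The sketch below is illustrative only and is not LiteLLM's implementation: the function name `resolveUpstream` is hypothetical, and it assumes that whatever follows the route prefix is appended unchanged to the upstream host.

```javascript
// Hypothetical helper: maps a proxy path to the upstream URL it would be forwarded to.
// Assumption: the part of the path after the route prefix is passed through unchanged.
function resolveUpstream(proxyPath, vertexLocation) {
  const discoveryPrefix = '/vertex_ai/discovery';
  const vertexPrefix = '/vertex_ai';

  // Check the longer prefix first so discovery requests are not treated as regional Vertex calls.
  if (proxyPath === discoveryPrefix || proxyPath.startsWith(discoveryPrefix + '/')) {
    return 'https://discoveryengine.googleapis.com' + proxyPath.slice(discoveryPrefix.length);
  }
  if (proxyPath === vertexPrefix || proxyPath.startsWith(vertexPrefix + '/')) {
    return `https://${vertexLocation}-aiplatform.googleapis.com` + proxyPath.slice(vertexPrefix.length);
  }
  return null; // not a Vertex AI pass-through route
}
```

Checking the longer prefix first matters because every `/vertex_ai/discovery` path also starts with `/vertex_ai`.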

Usage

Simply replace https://REGION-aiplatform.googleapis.com with LITELLM_PROXY_BASE_URL/vertex_ai
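
As a minimal sketch of that substitution, the hypothetical helper `toProxyUrl` below rewrites a direct regional Vertex AI URL into its proxy equivalent. The function name and the host check are assumptions made for illustration; `proxyBaseUrl` stands in for LITELLM_PROXY_BASE_URL.

```javascript
// Hypothetical helper: swaps the regional Vertex AI host for <proxy base URL>/vertex_ai.
function toProxyUrl(vertexUrl, proxyBaseUrl) {
  const url = new URL(vertexUrl);
  // Only regional hosts of the form REGION-aiplatform.googleapis.com are rewritten.
  if (!url.hostname.endsWith('-aiplatform.googleapis.com')) {
    throw new Error(`Not a regional Vertex AI URL: ${vertexUrl}`);
  }
  const base = proxyBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${base}/vertex_ai${url.pathname}${url.search}`;
}

const direct =
  'https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1' +
  '/publishers/google/models/gemini-1.0-pro:generateContent';
console.log(toProxyUrl(direct, 'http://localhost:4000'));
```

The path and query string are kept as they are, so the request stays in Vertex AI's native format.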

LiteLLM supports 3 flows for calling Vertex AI endpoints via pass-through:

Specific credentials: the admin sets pass-through credentials for a specific project/region.
Default credentials: the admin sets default credentials.
Client-side credentials: the user sends their own credentials through to Vertex AI (default behavior: if no default or mapped credentials are found, the request is passed through directly).
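
The three flows can be read as a lookup order. The sketch below is a hypothetical illustration, not LiteLLM's implementation: the function name `pickCredentialFlow`, the data shapes, and the assumption that project/region-specific credentials are checked before default ones are all made up for the example. Only the final fallback (pass the request through with the client's own credentials) is stated in the text above.

```javascript
// Hypothetical sketch of choosing a credential flow for one pass-through request.
// Assumption: specific (project/region) credentials take precedence over defaults.
function pickCredentialFlow(request, specificCredentials, defaultCredentials) {
  const specific = specificCredentials.find(
    (c) => c.vertex_project === request.project && c.vertex_region === request.region
  );
  if (specific) {
    return { flow: 'specific', credentials: specific.vertex_credentials };
  }
  if (defaultCredentials) {
    return { flow: 'default', credentials: defaultCredentials.vertex_credentials };
  }
  // No mapped or default credentials: forward the request with the client's own credentials.
  return { flow: 'client-side', credentials: null };
}

const specificList = [
  { vertex_project: 'adroit-crow-413218', vertex_region: 'us-central1', vertex_credentials: '/path/to/credentials.json' },
];
console.log(pickCredentialFlow({ project: 'adroit-crow-413218', region: 'us-central1' }, specificList, null).flow);
```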

Example usage

Specific project/region

model_list:
  - model_name: gemini-1.0-pro
    litellm_params:
      model: vertex_ai/gemini-1.0-pro
      vertex_project: adroit-crow-413218
      vertex_region: us-central1
      vertex_credentials: /path/to/credentials.json
      use_in_pass_through: true # 👈 KEY CHANGE

Default credentials

Set in config.yaml

default_vertex_config:
  vertex_project: adroit-crow-413218
  vertex_region: us-central1
  vertex_credentials: /path/to/credentials.json

Set in environment variables

export DEFAULT_VERTEXAI_PROJECT="adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"

Client-side credentials

Try Gemini 2.0 Flash (curl)

MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="YOUR_PROJECT_ID"

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  "${LITELLM_PROXY_BASE_URL}/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:streamGenerateContent" -d \
  $'{
    "contents": {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "image/jpeg",
            "fileUri": "gs://generativeai-downloads/images/scones.jpg"
          }
        },
        {
          "text": "Describe this picture."
        }
      ]
    }
  }'

Example usage

curl

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{
    "contents":[{
      "role": "user",
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Vertex Node.js SDK

const { VertexAI } = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({
    project: 'your-project-id', // enter your vertex project id
    location: 'us-central1', // enter your vertex region
    apiEndpoint: "localhost:4000/vertex_ai" // <proxy-server-url>/vertex_ai # note, do not include 'https://' in the url
});

const model = vertexAI.getGenerativeModel({
    model: 'gemini-1.0-pro'
}, {
    customHeaders: {
        "x-litellm-api-key": "sk-1234" // Your litellm Virtual Key
    }
});

async function generateContent() {
    try {
        const prompt = {
            contents: [{
                role: 'user',
                parts: [{ text: 'How are you doing today?' }]
            }]
        };

        const response = await model.generateContent(prompt);
        console.log('Response:', response);
    } catch (error) {
        console.error('Error:', error);
    }
}

generateContent();

Quick start

Let's call the Vertex AI `/generateContent` endpoint.

Add Vertex AI credentials to your environment

export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"

Start the LiteLLM proxy

litellm

# RUNNING on http://0.0.0.0:4000

Test it!

Let's call the Vertex AI generateContent endpoint through the proxy

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "contents":[{
      "role": "user",
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Supported API endpoints

Gemini API
Embeddings API
Imagen API
Code Completion API
Batch Prediction API
Tuning API
CountTokens API
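
Most of these APIs are addressed as a method suffix on a model path. The sketch below collects the suffixes that appear in this document's curl examples and builds the corresponding proxy URL. The helper name `modelEndpoint` and the API labels used as map keys are hypothetical; APIs without a curl example in this document are deliberately left out rather than guessed.

```javascript
// Method suffixes taken from the curl examples in this document.
// The map keys are labels chosen for this sketch, not official identifiers.
const METHOD_BY_API = new Map([
  ['gemini', 'generateContent'],
  ['embeddings', 'predict'],
  ['imagen', 'predict'],
  ['countTokens', 'countTokens'],
]);

// Hypothetical helper: builds <proxy>/vertex_ai/v1/projects/.../models/<model>:<method>.
function modelEndpoint(proxyBaseUrl, projectId, location, model, api) {
  const method = METHOD_BY_API.get(api);
  if (!method) throw new Error(`Unknown API: ${api}`);
  const base = proxyBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return (
    `${base}/vertex_ai/v1/projects/${projectId}/locations/${location}` +
    `/publishers/google/models/${model}:${method}`
  );
}

console.log(modelEndpoint('http://localhost:4000', 'my-project', 'us-central1', 'textembedding-gecko@001', 'embeddings'));
```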

Authentication to Vertex AI

The LiteLLM proxy server supports two ways of authenticating to Vertex AI:

Pass Vertex credentials from the client through the proxy server
Set Vertex AI credentials on the proxy server

Usage examples

Gemini API (generate content)

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'

Embeddings API

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/textembedding-gecko@001:predict \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"instances":[{"content": "gm"}]}'

Imagen API

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/imagen-3.0-generate-001:predict \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"instances":[{"prompt": "make an otter"}], "parameters": {"sampleCount": 1}}'

Count Tokens API

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:countTokens \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'

Tuning API

Create a fine-tuning job

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/tuningJobs \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{
    "baseModel": "gemini-1.0-pro-002",
    "supervisedTuningSpec": {
      "training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl"
    }
  }'

Advanced

Prerequisites

Set up the proxy with a DB

Use this to avoid giving developers the raw Vertex AI credentials while still letting them use Vertex AI endpoints.

Use with virtual keys

Set up the environment

export DATABASE_URL=""
export LITELLM_MASTER_KEY=""

# vertex ai credentials
export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"

litellm

# RUNNING on http://0.0.0.0:4000

Generate a virtual key

curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'x-litellm-api-key: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'

Expected response

{
    ...
    "key": "sk-1234ewknldferwedojwojw"
}
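
As a minimal sketch of the next step, the hypothetical helper `headersFromKeyResponse` below takes the response text from `/key/generate` and turns its `key` field into request headers. The helper name is an assumption made for illustration; the `Bearer` prefix follows the curl examples in this document, and only the `key` field shown in the expected response is relied on.

```javascript
// Hypothetical helper: extracts the virtual key from a /key/generate response body
// and returns headers for later pass-through requests.
function headersFromKeyResponse(responseText) {
  const body = JSON.parse(responseText);
  if (typeof body.key !== 'string' || body.key.length === 0) {
    throw new Error('Response did not contain a "key" field');
  }
  return {
    'Content-Type': 'application/json',
    // Header name and "Bearer" prefix as used in this document's curl examples.
    'x-litellm-api-key': `Bearer ${body.key}`,
  };
}

console.log(headersFromKeyResponse('{"key": "sk-1234ewknldferwedojwojw"}'));
```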

Test it!

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{
    "contents":[{
      "role": "user", 
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Send `tags` in request headers

Use this if you want tags to be tracked in the LiteLLM DB and on logging callbacks.

Pass tags in the request header as a comma-separated list. In the example below, the following tags will be tracked:

tags: ["vertex-js-sdk", "pass-through-endpoint"]
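
When the tags live in an array, the header value is simply the items joined with commas. The helper `tagsHeader` below is a hypothetical sketch of that conversion. Rejecting tags that contain a comma is an assumption made here, since such a tag could not be told apart from two separate tags in a comma-separated list.

```javascript
// Hypothetical helper: turns an array of tags into the comma-separated `tags` header.
function tagsHeader(tags) {
  const cleaned = tags.map((t) => t.trim());
  for (const t of cleaned) {
    // Assumption: a comma inside a tag would be ambiguous in a comma-separated list.
    if (t.length === 0 || t.includes(',')) {
      throw new Error(`Invalid tag: "${t}"`);
    }
  }
  return { tags: cleaned.join(',') };
}

console.log(tagsHeader(['vertex-js-sdk', 'pass-through-endpoint']));
```

The returned object can be merged into the headers of a fetch call or into the `customHeaders` option of the Vertex Node.js SDK.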

curl

curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -H "tags: vertex-js-sdk,pass-through-endpoint" \
  -d '{
    "contents":[{
      "role": "user",
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Vertex Node.js SDK

const { VertexAI } = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({
    project: 'your-project-id', // enter your vertex project id
    location: 'us-central1', // enter your vertex region
    apiEndpoint: "localhost:4000/vertex_ai" // <proxy-server-url>/vertex_ai # note, do not include 'https://' in the url
});

const model = vertexAI.getGenerativeModel({
    model: 'gemini-1.0-pro'
}, {
    customHeaders: {
        "x-litellm-api-key": "sk-1234", // Your litellm Virtual Key
        "tags": "vertex-js-sdk,pass-through-endpoint"
    }
});

async function generateContent() {
    try {
        const prompt = {
            contents: [{
                role: 'user',
                parts: [{ text: 'How are you doing today?' }]
            }]
        };

        const response = await model.generateContent(prompt);
        console.log('Response:', response);
    } catch (error) {
        console.error('Error:', error);
    }
}

generateContent();
