📄️ Quick Start
Quick start: CLI, Config, Docker
📄️ 🐳 Docker, Deploying LiteLLM Proxy
You can find the Dockerfile to build litellm proxy here
📄️ ⚡ Best Practices for Production
1. Use this config.yaml
🔗 📖 All Endpoints (Swagger)
📄️ 🎉 Demo App
Here is a demo of the proxy. To log in pass in:
📄️ Proxy Config.yaml
Set model list, api_base, api_key, temperature & proxy server settings (master-key) on the config.yaml.
📄️ 🔥 Fallbacks, Retries, Timeouts, Load Balancing
Retry call with multiple instances of the same model.
📄️ 💸 Spend Tracking
Track spend for keys, users, and teams across 100+ LLMs.
📄️ 💰 Budgets, Rate Limits
Requirements:
📄️ 💰 Billing
Bill users for their usage.
📄️ Use with Langchain, OpenAI SDK, LlamaIndex, Curl
Inputs, outputs, and exceptions are mapped to the OpenAI format for all supported models
📄️ ✨ Enterprise Features - Content Mod, SSO
Features here are behind a commercial license in our /enterprise folder. See Code
📄️ 🔑 Virtual Keys
Track spend and control model access via virtual keys for the proxy
📄️ 🚨 Alerting
Get alerts for:
🗃️ Logging
2 items
📄️ 👥 Team-based Routing + Logging
Routing
📄️ Region-based Routing
Route specific customers to eu-only models.
📄️ [BETA] Proxy UI
Create + delete keys through a UI
📄️ [BETA] JWT-based Auth
Use JWTs to authenticate admins / projects into the proxy.
🗃️ Extra Load Balancing
1 item
📄️ Model Management
Add new models + get model info without restarting the proxy.
📄️ Health Checks
Use this to health check all LLMs defined in your config.yaml
📄️ Debugging
Two levels of debugging are supported.
📄️ PII Masking
LiteLLM supports Microsoft Presidio for PII masking.
📄️ Prompt Injection
LiteLLM supports similarity checking against a pre-generated list of prompt injection attacks, to identify if a request contains an attack.
📄️ Caching
Cache LLM Responses
📄️ Grafana, Prometheus metrics [BETA]
LiteLLM exposes a /metrics endpoint for Prometheus to poll
📄️ Modify / Reject Incoming Requests
Modify data before making LLM API calls on the proxy
📄️ Post-Call Rules
Use this to fail a request based on the output of an LLM API call.
📄️ CLI Arguments
CLI arguments: --host, --port, --num_workers