Six API keys, six SDKs, six sets of weird edge cases. LiteLLM gives you a single OpenAI-compatible endpoint and routes to any provider — so swapping from GPT to Claude to Llama becomes a one-string change. Here's the setup that actually works in 10 minutes.

LiteLLM in 10 Minutes: One Interface, Every LLM Provider

I have six different LLM API keys in my .env file. OpenAI. Anthropic. Google. Mistral. Groq. Together. Every project I touch has a different one in the lead role — and every provider ships a different SDK, a different response shape, a different streaming behavior, a different tool-calling format. Switching costs in this stack are real.

LiteLLM fixes it. It's an open-source proxy that gives you an OpenAI-compatible endpoint and routes to every major provider (the project advertises 100+ models). Set it up once, swap models with a string, and your calling code stays the same.

Here's the part that takes 10 minutes once you know it.

The Core Idea

You run LiteLLM as a local proxy on port 4000. Your app talks to it like it's OpenAI. The proxy translates to whatever provider you've configured. To move from GPT-4o to Claude to Llama, you change the model name in your request — that's it.

bash

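# Quick start: serve one model straight from the CLI
# (expects OPENAI_API_KEY in the environment)
litellm --model openai/gpt-4o

That's the first step. The --model flag takes a single model, so the three-providers-behind-one-endpoint setup belongs in a config.yaml, covered next. Either way, your app only ever needs one SDK.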

Install and Configure

bash

pip install 'litellm[proxy]==1.51.0'

Create a config.yaml, which is where you centralize everything, then start the proxy with litellm --config config.yaml:

yaml

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gemini-flash
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY
litellm_settings:
  drop_params: true   # silently ignore params a provider doesn't support
  telemetry: false    # opt out of phoning home

drop_params: true is the one flag that saves you hours. Anthropic doesn't have frequency_penalty? LiteLLM drops it instead of erroring. Always set it.
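To make that behavior concrete, here is a toy model of what the flag does. It is not LiteLLM's implementation: the SUPPORTED table and the filter_params helper are hypothetical names, and the parameter sets are simplified for the example.

```python
# Toy illustration only: SUPPORTED and filter_params are hypothetical,
# and the parameter sets below are simplified for the example.
SUPPORTED = {
    "openai": {"temperature", "max_tokens", "frequency_penalty"},
    "anthropic": {"temperature", "max_tokens"},
}


def filter_params(provider: str, params: dict, drop_params: bool) -> dict:
    """Return the params to forward, dropping or rejecting unsupported ones."""
    unsupported = set(params) - SUPPORTED[provider]
    if unsupported and not drop_params:
        raise ValueError(f"{provider} does not support: {sorted(unsupported)}")
    return {k: v for k, v in params.items() if k in SUPPORTED[provider]}


request = {"temperature": 0.2, "frequency_penalty": 0.5}
print(filter_params("anthropic", request, drop_params=True))
# prints {'temperature': 0.2}
```

With drop_params=False the same call raises instead, which is the failure the flag exists to avoid.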

Point Your Code at the Proxy

python

from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:4000",
    api_key="anything",  # not enforced unless the proxy sets a master_key
)
resp = client.chat.completions.create(
    model="claude-sonnet",   # swap to "gpt-4o" with no other change
    messages=[{"role": "user", "content": "Explain KV cache in 2 sentences."}],
)
print(resp.choices[0].message.content)

Your code doesn't know — and doesn't care — which provider answered. Run an A/B test by changing one string. Migrate providers in a PR with a single line of diff.
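For the A/B case, one way to choose that string is to hash a stable identifier so each user always lands on the same model. This is a minimal sketch, not part of LiteLLM: pick_model and VARIANTS are hypothetical names, and the aliases are assumed to match model_name entries in your config.yaml.

```python
import hashlib

# Hypothetical aliases; they must match model_name entries in config.yaml.
VARIANTS = ["gpt-4o", "claude-sonnet"]


def pick_model(user_id: str, variants: list[str] = VARIANTS) -> str:
    """Deterministically map a user to one variant so their experience is stable."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


model = pick_model("user-42")
# Pass `model` as the model= argument of client.chat.completions.create(...)
print(model in VARIANTS)
# prints True
```

Hashing instead of random choice keeps assignments reproducible across processes and restarts, which makes the results easier to analyze later.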

Patterns That Pay Off Immediately

Fallback chains. When a provider hiccups, LiteLLM retries the next one — no code change required.

yaml

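model_list:
  - model_name: reliable
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  # claude-sonnet and gemini-flash entries as defined earlier
litellm_settings:
  fallbacks: [{"reliable": ["claude-sonnet", "gemini-flash"]}]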
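Conceptually, a fallback chain is just "try the next alias when the current one fails". The sketch below shows that logic on the client side with a stand-in for the network call. It illustrates the idea only and is not how the proxy is implemented; call_with_fallbacks and fake_send are hypothetical names.

```python
from typing import Callable, Sequence


def call_with_fallbacks(
    send: Callable[[str], str],
    models: Sequence[str],
) -> tuple[str, str]:
    """Try each model alias in order; return (alias, reply) from the first success."""
    errors = []
    for alias in models:
        try:
            return alias, send(alias)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{alias}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_send(alias: str) -> str:
    """Stand-in for a real request: pretend the primary model is down."""
    if alias == "gpt-4o":
        raise TimeoutError("provider timeout")
    return f"reply from {alias}"


print(call_with_fallbacks(fake_send, ["gpt-4o", "claude-sonnet", "gemini-flash"]))
# prints ('claude-sonnet', 'reply from claude-sonnet')
```

Letting the proxy handle this keeps the retry policy in config rather than scattered across every caller.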

Cost routing. Cheap model by default, expensive model for hard queries.

python

def route(prompt: str) -> str:
    return "gemini-flash" if len(prompt) < 500 else "claude-sonnet"

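Cheap logging. LiteLLM tracks cost, latency, and token counts for every call, but persisting them takes a little setup: connect the proxy to a Postgres database via DATABASE_URL, or register a logging callback under litellm_settings. That's the "log every model call" pattern with a few lines of config instead of custom code.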
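If you also want a record you fully control on the client side, a few lines of sqlite3 will do. This is a minimal sketch under assumptions: log_call and the calls table are hypothetical, the response is assumed to expose OpenAI-style usage counts, and cost is left out because prices vary by provider and change over time.

```python
import sqlite3
import time
from types import SimpleNamespace


def log_call(db: sqlite3.Connection, model: str, started: float, resp) -> None:
    """Record one call; `resp` is expected to expose OpenAI-style `usage` counts."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS calls "
        "(model TEXT, latency_s REAL, prompt_tokens INTEGER, completion_tokens INTEGER)"
    )
    db.execute(
        "INSERT INTO calls VALUES (?, ?, ?, ?)",
        (
            model,
            time.monotonic() - started,
            resp.usage.prompt_tokens,
            resp.usage.completion_tokens,
        ),
    )
    db.commit()


db = sqlite3.connect(":memory:")  # use a file path to persist across runs
started = time.monotonic()
# Stand-in for the object returned by client.chat.completions.create(...)
fake_resp = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30))
log_call(db, "claude-sonnet", started, fake_resp)
print(db.execute("SELECT model, prompt_tokens, completion_tokens FROM calls").fetchall())
# prints [('claude-sonnet', 12, 30)]
```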

The Mistakes I Made

Don't assume unsupported params are quietly ignored. They aren't: without drop_params: true, LiteLLM raises an error when a provider doesn't accept a param, and with it the param silently disappears. Supported params are forwarded, and something like temperature=0 can mean subtly different things to different providers. And always pass model= explicitly in the request body. If you don't, the proxy picks the first model in your list, which is rarely the one you wanted.

If you only ever use one provider, skip LiteLLM — the abstraction is overhead you don't earn back. The win shows up at the second provider, or the day a regional outage forces a failover in 30 seconds.

Set it up once on a Monday. By Friday you'll have routed around two outages and A/B tested three models without touching a line of business logic. That's the win.

LiteLLM is the only piece of LLM infrastructure I install in every project before writing code. The day a provider hiccups, you'll thank past-you.