
Nife LiteLLM

Docker Image Python License: MIT GitHub Repo

A Model Context Protocol (MCP) server for multi-provider LLM API access via LiteLLM



Overview

The Nife LiteLLM MCP Server exposes the Nife LiteLLM API through a standardized Model Context Protocol interface, so Claude Desktop and other MCP-compatible clients can use it directly. It provides unified access to multiple LLM providers, including OpenAI, Anthropic, Google, Mistral, Cohere, Together AI, DeepSeek, and Groq.

Key Features

  • 🔄 Multi-Provider Support – OpenAI, Anthropic, Google, Mistral, Cohere, Together AI, DeepSeek, Groq
  • 🧠 Auto-Detection – Automatic provider routing from model identifiers
  • 📦 Batch Processing – Handle multiple prompts in a single request
  • 🛡️ Secure – Bearer token authentication, non-root container execution
  • ⚡ Fast – ~2-3s startup time, minimal resource footprint
  • 🏥 Production-Ready – Health checks, structured logging, graceful error handling
  • 🐳 Containerized – Full Docker support with Docker Compose
  • 📡 REST API – Clean, documented HTTP endpoints
  • 🔐 Error Resilience – Partial success is reported with HTTP 207 (Multi-Status) responses
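
To illustrate the Auto-Detection feature listed above, here is a minimal sketch of how a provider could be inferred from a model identifier by prefix matching. The prefix table and the `detect_provider` function are assumptions made for illustration; the server's actual routing rules are not documented on this page and may differ.

```python
from typing import Optional

# Hypothetical prefix table, NOT taken from the Nife LiteLLM source.
PROVIDER_PREFIXES = {
    "gpt-": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "mistral-": "mistral",
    "command": "cohere",
    "deepseek-": "deepseek",
}


def detect_provider(model_id: str) -> Optional[str]:
    """Return the provider whose prefix matches model_id, or None if unknown."""
    normalized = model_id.strip().lower()
    for prefix, provider in PROVIDER_PREFIXES.items():
        if normalized.startswith(prefix):
            return provider
    return None
```

With this table, `detect_provider("gpt-4")` resolves to `"openai"`, and an unrecognized identifier returns `None` so the caller can reject the request instead of guessing.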

Supported Providers

| Provider    | Models                                 | Status       |
|-------------|----------------------------------------|--------------|
| OpenAI      | GPT-4, GPT-4-turbo, GPT-3.5-turbo, GPT-4o | ✅ Supported |
| Anthropic   | Claude 3 Opus, Sonnet, Haiku           | ✅ Supported |
| Google      | Gemini Pro, 1.5 Pro, 1.5 Flash         | ✅ Supported |
| Mistral     | Large, Medium, Small                   | ✅ Supported |
| Cohere      | Command, Command-R                     | ✅ Supported |
| Together AI | Llama models, Meta LLaMA 3             | ✅ Supported |
| DeepSeek    | DeepSeek-Chat, DeepSeek-Coder          | ✅ Supported |
| Groq        | Mixtral, LLaMA 2                       | ✅ Supported |

Installation

Option 1: Docker Compose

# Clone the repository
git clone https://github.com/nifetency/nife-litellm.git
cd nife-litellm

# Start the API
docker-compose up -d

# Verify health
curl http://localhost:8080/health

Option 2: Docker

# Build the image
docker build -t nife-llmlite .

# Run container
docker run -d \
  -p 8080:8080 \
  --name nife-llmlite \
  nife-llmlite

# Check logs
docker logs -f nife-llmlite

Option 3: Local Development

# Install dependencies
pip install -r requirements.txt

# Run application
python app.py

# Or with gunicorn
gunicorn --bind 0.0.0.0:8080 --workers 4 app:app

Quick Start

1. Start the Service

docker-compose up -d

2. Health Check

curl http://localhost:8080/health

Response:

{
  "status": "healthy",
  "timestamp": "2024-12-16T10:30:00.000000"
}
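
The same check can be scripted, for example to wait for the container before sending requests. The sketch below assumes only the response shape shown above (a JSON object with a `status` field); the helper names `is_healthy` and `check_health` are illustrative and not part of the project.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports status 'healthy'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url: str = "http://localhost:8080", timeout: float = 5.0) -> bool:
    """Fetch /health and report whether the service says it is healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except OSError:
        # Connection errors, timeouts, and HTTP errors all derive from OSError.
        return False
```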

3. Test a Completion

curl -X POST http://localhost:8080/api/completion \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "gpt-4",
    "api_key": "sk-YOUR_OPENAI_KEY",
    "prompts": ["What is artificial intelligence?"],
    "temperature": 0.7,
    "max_tokens": 500
  }'
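
For callers working in Python, the curl request above could be built with the standard library roughly as follows. This is a sketch that mirrors the request fields shown in the example; `build_completion_request` is a hypothetical helper, and the response is left unparsed because its schema is not documented here.

```python
import json
import urllib.request


def build_completion_request(base_url, model_id, api_key, prompts,
                             temperature=0.7, max_tokens=500):
    """Build a POST request for /api/completion with the fields from the curl example."""
    body = {
        "model_id": model_id,
        "api_key": api_key,
        "prompts": list(prompts),
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/completion",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Requires a running server and a real provider key.
    request = build_completion_request(
        "http://localhost:8080", "gpt-4", "sk-YOUR_OPENAI_KEY",
        ["What is artificial intelligence?"],
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        print(resp.status, resp.read().decode("utf-8"))
```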

4. Batch Processing

curl -X POST http://localhost:8080/api/completion \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "claude-3-sonnet",
    "api_key": "sk-ant-YOUR_ANTHROPIC_KEY",
    "prompts": [
      "Explain quantum computing",
      "What is machine learning?",
      "Define artificial intelligence"
    ],
    "temperature": 0.5,
    "max_tokens": 1000
  }'
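
Because a batch can partially succeed (the Key Features list mentions 207 status codes), a client should not treat every 2xx response as a full success. The sketch below classifies the outcome from the HTTP status code alone; the per-prompt layout of the response body is not documented on this page, so it is deliberately not inspected. `describe_batch_status` is a hypothetical helper name.

```python
def describe_batch_status(status_code: int) -> str:
    """Classify a batch completion response by its HTTP status code."""
    if status_code == 207:
        # Multi-Status: some prompts succeeded and some failed.
        return "partial"
    if 200 <= status_code < 300:
        return "ok"
    return "failed"
```

A caller that receives `"partial"` would then need to read the response body to find out which prompts to retry.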

Configuration

The Nife LiteLLM MCP Server can be configured using environment variables.

Environment Variables

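| Variable  | Description                                  | Default |
|-----------|----------------------------------------------|---------|
| PORT      | The port the server will listen on           | 8080    |
| LOG_LEVEL | Logging level (DEBUG, INFO, WARNING, ERROR)  | INFO    |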
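
As an illustration of how these two variables and their defaults might be read at startup, here is a minimal sketch. The `Settings` class and `load_settings` function are assumptions for this example and are not taken from the project's `app.py`.

```python
import os
from dataclasses import dataclass

VALID_LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")


@dataclass(frozen=True)
class Settings:
    port: int = 8080
    log_level: str = "INFO"


def load_settings(env=None) -> Settings:
    """Read PORT and LOG_LEVEL from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    port = int(env.get("PORT", "8080"))
    log_level = env.get("LOG_LEVEL", "INFO").upper()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {log_level!r}")
    return Settings(port=port, log_level=log_level)
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.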

API Reference

Endpoints

Root Endpoint

GET /

Health Check

GET /health

Completion (Main)

POST /api/completion

List Models

GET /api/models
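
The four endpoints above can be collected into a small lookup table for client code. This sketch only records the methods and paths listed in this section; the `ENDPOINTS` mapping and `endpoint` helper are illustrative names, and no request or response schema is assumed.

```python
# Method and path for each endpoint listed in the API Reference.
ENDPOINTS = {
    "root": ("GET", "/"),
    "health": ("GET", "/health"),
    "completion": ("POST", "/api/completion"),
    "models": ("GET", "/api/models"),
}


def endpoint(name: str, base_url: str = "http://localhost:8080"):
    """Return (method, absolute URL) for a named endpoint."""
    method, path = ENDPOINTS[name]
    return method, base_url.rstrip("/") + path
```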

Deployment

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nife-llmlite
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nife-llmlite
  template:
    metadata:
      labels:
        app: nife-llmlite
    spec:
      containers:
        - name: llmlite
          image: nife-llmlite:latest
          ports:
            - containerPort: 8080
          env:
            - name: PORT
              value: "8080"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 30
          resources:
            requests:
              cpu: "0.5"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"

Support


Made with ❤️ by the Nife team
