πŸ‘‹ Jan API Documentation

An OpenAI-compatible API for local and server deployments.

Local API

llama.cpp

Run Jan locally with complete privacy.

Base URL: http://localhost:1337/v1

Privacy-first β€’ GGUF models β€’ CPU/GPU

Jan Server

vLLM

A self-hostable server for high-throughput inference.

Base URL: http://your-server:8000/v1

Open source β€’ Auto-scaling β€’ Multi-GPU

Quick Start

1. Choose a deployment type
2. Start your server
3. Make API requests
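Because both deployments expose the same OpenAI-compatible surface, a request differs only in its base URL. A minimal sketch in Python using only the standard library; the model id here is a hypothetical placeholder for whichever model you have loaded:

```python
import json
import urllib.request

# Pick the base URL for your deployment.
BASE_URL = "http://localhost:1337/v1"      # local llama.cpp
# BASE_URL = "http://your-server:8000/v1"  # Jan Server (vLLM)

# "llama3.2-3b-instruct" is a stand-in -- substitute your loaded model's id.
payload = {
    "model": "llama3.2-3b-instruct",
    "messages": [{"role": "user", "content": "Hello, Jan!"}],
}

# Build the POST request against the OpenAI-style chat completions endpoint.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the server running, urllib.request.urlopen(req) returns an
# OpenAI-style JSON response containing a "choices" array.
print(req.full_url)
```

Swapping in the other base URL is the only change needed to target Jan Server instead of the local API.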