GPUStack Installation via Docker Compose
This guide explains how to deploy GPUStack and observability components (Prometheus, Grafana) using Docker Compose. NVIDIA and Ascend platforms are covered, with notes for other GPU types.
Overview of Services
Services:
- gpustack-server: Central server for scheduling, management, and built-in inference.
- gpustack-worker: (Optional) Distributed inference worker, can run on separate nodes.
- prometheus: Metrics collection.
- grafana: Metrics visualization.
Prerequisites
- Docker Compose installed (guide).
- Required ports available (see requirements).
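Both prerequisites can be checked from a shell before deploying. The sketch below is illustrative only: the helper name `port_listed` is hypothetical, and port 80 is an assumption inferred from the UI address `http://your_host_ip`; the authoritative port list is in the requirements.

```shell
#!/bin/sh
# port_listed PORT: reads `ss -ltn` output on stdin and succeeds if any
# listening socket's local address ends in :PORT. (Hypothetical helper.)
port_listed() {
  awk -v p="$1" '
    { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END { exit (found ? 0 : 1) }'
}

# Confirm that the Docker Compose plugin responds.
if docker compose version >/dev/null 2>&1; then
  echo "Docker Compose: OK"
else
  echo "Docker Compose: not found"
fi

# Port 80 is assumed here; substitute the ports from the requirements.
if command -v ss >/dev/null 2>&1 && ss -ltn | port_listed 80; then
  echo "Port 80: already in use"
else
  echo "Port 80: no listener detected"
fi
```

The helper parses text rather than calling `ss` itself, so it can be pointed at output captured from another host.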
NVIDIA
Requirements
- NVIDIA GPU driver (CUDA 12.4+), verify with:
    nvidia-smi
- NVIDIA Container Toolkit, verify with:
    sudo docker info | grep nvidia
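The `nvidia-smi` banner includes a `CUDA Version: X.Y` field, so the 12.4 minimum can be checked mechanically instead of by eye. The following is a minimal sketch; `cuda_at_least` is a hypothetical helper name, not part of GPUStack or the NVIDIA tooling.

```shell
#!/bin/sh
# cuda_at_least MAJOR MINOR: reads `nvidia-smi` output on stdin and succeeds
# if the reported CUDA version is at least MAJOR.MINOR. (Hypothetical helper.)
cuda_at_least() {
  want_major=$1
  want_minor=$2
  # Extract "MAJOR MINOR" from the first "CUDA Version: X.Y" occurrence.
  ver=$(sed -n 's/.*CUDA Version: *\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p' | head -n 1)
  [ -n "$ver" ] || return 1
  set -- $ver
  [ "$1" -gt "$want_major" ] || { [ "$1" -eq "$want_major" ] && [ "$2" -ge "$want_minor" ]; }
}

# Only run the check where the driver tools are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  if nvidia-smi | cuda_at_least 12 4; then
    echo "CUDA version: OK"
  else
    echo "CUDA version: below 12.4 or not reported"
  fi
fi
```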
Deployment
- Server (compose file):
    sudo docker compose -f docker-compose.server.nvidia.yaml up -d
  Access UI: http://your_host_ip
  Get admin password:
    sudo docker exec -it gpustack-server cat /var/lib/gpustack/initial_admin_password
- Worker (compose file):
  - Edit file: set server-url and token.
  - Start:
    sudo docker compose -f docker-compose.worker.nvidia.yaml up -d
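When the deployment is scripted, the admin password file may not exist the instant the server container starts. One way to handle this is to poll before reading it. This is a sketch under assumptions: `retry` is a hypothetical helper, and the commented usage assumes the `test` utility is available inside the `gpustack-server` image.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, waiting DELAY
# seconds between tries; fail after ATTEMPTS tries. (Hypothetical helper.)
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while ! "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 0
}

# Usage sketch (left commented out because it needs a running container).
# -it is dropped since a script has no interactive terminal.
# retry 30 2 sudo docker exec gpustack-server \
#   test -s /var/lib/gpustack/initial_admin_password \
#   && sudo docker exec gpustack-server \
#        cat /var/lib/gpustack/initial_admin_password
```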
Ascend
Requirements
- Ascend NPU Driver supporting Ascend CANN 8.2 or higher, verify with:
    sudo npu-smi info
- Ascend Container Toolkit, verify with:
    sudo docker info 2>/dev/null | grep -q "ascend" \
      && echo "Ascend Container Toolkit OK" \
      || (echo "Ascend Container Toolkit not configured"; exit 1)
Deployment
- Device detection (before starting):
    export ASCEND_VISIBLE_DEVICES=$(ls /dev/davinci* 2>/dev/null | head -1 | grep -o '[0-9]\+' || echo "0")
- Server (compose file):
    sudo -E docker compose -f docker-compose.server.ascend.yaml up -d
  Access UI: http://your_host_ip
  Get admin password:
    sudo docker exec -it gpustack-server cat /var/lib/gpustack/initial_admin_password
- Worker (compose file):
  - Edit file: set server-url and token.
  - Start:
    sudo -E docker compose -f docker-compose.worker.ascend.yaml up -d
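The device-detection command above exports only the first device index it finds (`head -1`). On hosts with several NPUs, a variant that collects every index may be more useful. The sketch below assumes that `ASCEND_VISIBLE_DEVICES` accepts a comma-separated list such as `0,1`; confirm this against the Ascend container runtime documentation. `davinci_indices` is a hypothetical helper name.

```shell
#!/bin/sh
# davinci_indices: reads device paths on stdin (one per line) and prints the
# numeric indices of /dev/davinciN entries as a comma-separated list.
# Non-numbered nodes such as /dev/davinci_manager are skipped.
davinci_indices() {
  sed -n 's|^/dev/davinci\([0-9][0-9]*\)$|\1|p' | sort -n | paste -s -d, -
}

devices=$(ls /dev/davinci* 2>/dev/null | davinci_indices)
# Fall back to "0" when nothing is detected, mirroring the original command.
export ASCEND_VISIBLE_DEVICES="${devices:-0}"
echo "ASCEND_VISIBLE_DEVICES=$ASCEND_VISIBLE_DEVICES"
```

As in the original command, the variable reaches Compose only if `sudo` is run with `-E`, which preserves the caller's environment.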
Other GPU Platforms
Refer to requirements for platform-specific setup (AMD, MLU, etc.).
Deployment
- Edit Compose files as needed:
  - Adjust/remove runtime: nvidia.
  - Set environment variables and volumes for your hardware.
  - For workers, set correct server-url and token.
- Start services using docker compose -f <compose-file> up -d.
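Removing the `runtime: nvidia` line can be done by hand or with a small filter. The following is a minimal sketch: `strip_nvidia_runtime` is a hypothetical helper, the output filename in the usage comment is made up for illustration, and environment variables and volumes still need to be adjusted manually for your hardware.

```shell
#!/bin/sh
# strip_nvidia_runtime: copies a Compose file from stdin to stdout, dropping
# any line that consists solely of "runtime: nvidia" (at any indentation).
strip_nvidia_runtime() {
  sed '/^[[:space:]]*runtime:[[:space:]]*nvidia[[:space:]]*$/d'
}

# Usage sketch (commented out; docker-compose.server.custom.yaml is a
# hypothetical output name):
# strip_nvidia_runtime < docker-compose.server.nvidia.yaml \
#   > docker-compose.server.custom.yaml
# sudo docker compose -f docker-compose.server.custom.yaml up -d
```

Writing to a new file instead of editing in place keeps the original Compose file available for comparison.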