Installation Requirements
This page outlines the software and networking requirements for nodes running GPUStack.
Operating System Requirements
GPUStack supports most modern Linux distributions on AMD64 and ARM64 architectures.
Note
- Installing GPUStack directly from PyPI is not recommended. For best compatibility, use the provided Docker images (an illustrative run command follows this note).
- A Network Time Protocol (NTP) package must be installed to keep node clocks in sync, which is required for consistent state synchronization between nodes.
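For reference, a Docker-based install on an NVIDIA host often looks roughly like the sketch below. The image, flags, and data path here are illustrative assumptions; consult the installation guide for the exact command for your accelerator and GPUStack version.
sudo docker run -d --name gpustack \
  --restart=unless-stopped \
  --gpus all \
  --network=host \
  --ipc=host \
  -v gpustack-data:/var/lib/gpustack \
  gpustack/gpustack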
Accelerator Runtime Requirements
GPUStack supports a variety of general-purpose accelerators for running inference, including:
- NVIDIA GPU
- AMD GPU
- Ascend NPU
- Hygon DCU
- MThreads GPU (Experimental)
- Iluvatar GPU (Experimental)
- MetaX GPU (Experimental)
- Cambricon MLU (Experimental)
Ensure all required drivers and toolkits are installed before running GPUStack.
NVIDIA GPU
Requirements
- NVIDIA GPU Driver that supports NVIDIA CUDA 12.4 or higher.
- NVIDIA Container Toolkit
Run the following commands to verify:
sudo nvidia-smi
# If using Docker
sudo docker info 2>/dev/null | grep -q "nvidia" \
&& echo "NVIDIA Container Toolkit OK" \
|| (echo "NVIDIA Container Toolkit not configured"; exit 1)
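If both checks pass, you can optionally confirm end-to-end GPU access from a container; a minimal sketch, assuming the public nvidia/cuda base image is pullable:
# Should print the same GPU table as the host nvidia-smi
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi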
Supported Inference Backends
AMD GPU
Requirements
- AMD GPU Driver that supports AMD ROCm 6.4 or higher.
- AMD Container Runtime
Run the following commands to verify:
sudo amd-smi static
# If using Docker
sudo docker info 2>/dev/null | grep -q "amd" \
&& echo "AMD Container Toolkit OK" \
|| (echo "AMD Container Toolkit not configured"; exit 1)
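Optionally, confirm GPU visibility from inside a container as well; a minimal sketch, assuming the public rocm/rocm-terminal image and the standard /dev/kfd and /dev/dri device nodes:
# Should list the same GPUs as the host amd-smi
sudo docker run --rm --device=/dev/kfd --device=/dev/dri \
  --security-opt seccomp=unconfined rocm/rocm-terminal rocm-smi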
Supported Inference Backends
- vLLM
- Custom
Ascend NPU
Requirements
Run the following commands to verify:
sudo npu-smi info
# If using Docker
sudo docker info 2>/dev/null | grep -q "ascend" \
&& echo "Ascend Container Toolkit OK" \
|| (echo "Ascend Container Toolkit not configured"; exit 1)
Supported Devices
- Ascend NPU 910C series
- Ascend NPU 910B series (910B1 ~ 910B4)
- Ascend NPU 310P3
Supported Inference Backends
Hygon DCU
Requirements
Run the following commands to verify:
sudo hy-smi
Supported Devices
- Hygon DCU: K100_AI (verified); Z100, Z100L, K100 (not verified)
Supported Inference Backends
- vLLM
- Custom
MThreads GPU
Requirements
Run the following commands to verify:
sudo mthreads-gmi
# If using Docker
sudo docker info 2>/dev/null | grep -q "mthreads" \
&& echo "MThreads Container Toolkit OK" \
|| (echo "MThreads Container Toolkit not configured"; exit 1)
Supported Inference Backends
- Custom
Iluvatar GPU
Requirements
Run the following commands to verify:
sudo ixsmi
# If using Docker
sudo docker info 2>/dev/null | grep -q "iluvatar" \
&& echo "Iluvatar Container Toolkit OK" \
|| (echo "Iluvatar Container Toolkit not configured"; exit 1)
Supported Inference Backends
- vLLM
- Custom
MetaX GPU
Requirements
Run the following commands to verify:
sudo mx-smi
Supported Inference Backends
- Custom
Cambricon MLU
Requirements
- Cambricon MLU Driver
- Cambricon NeuWare Toolkit
Run the following commands to verify:
sudo cnmon
Supported Inference Backends
- Custom
Networking Requirements
Connectivity Requirements
The following network connectivity is required for GPUStack to function properly (a quick reachability check is sketched after this list):
- Server-to-Worker: The server must be able to reach workers to proxy inference requests.
- Worker-to-Server: Workers must be able to reach the server to register themselves and send updates.
- Worker-to-Worker: Workers must be able to reach each other for distributed inference across multiple workers.
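Before installing, you can sanity-check these paths with a plain TCP reachability test; a sketch assuming the default ports listed below and placeholder addresses:
# From a worker node: can it reach the server's UI/API port?
nc -zv <server-ip> 80
# From the server (or another worker): can it reach a worker?
nc -zv <worker-ip> 10150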
Port Requirements
GPUStack uses the following ports for communication (example firewall rules for opening them are sketched after the worker port table):
Server Ports
| Port | Description |
|---|---|
| TCP 80 | Default port for GPUStack UI and API endpoints |
| TCP 443 | Default port for GPUStack UI and API endpoints (TLS enabled) |
| TCP 10161 | Default port for server metrics endpoint |
| TCP 8080 | Default port for GPUStack server internal API |
| TCP 5432 | Default port for the embedded PostgreSQL database |
Worker Ports
| Port | Description |
|---|---|
| TCP 10150 | Default port for GPUStack worker |
| TCP 10151 | Default port for worker metrics endpoint |
| TCP 8080 | Default port for GPUStack worker internal API |
| TCP 40000-40063 | Port range for inference services |
| TCP 41000-41999 | Port range for Ray services (used by distributed vLLM deployments) |
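If a host firewall is enabled, these ports must be opened. A sketch assuming firewalld; adapt the tool and the port list to your deployment and the ports you actually use:
# On the server node
sudo firewall-cmd --permanent --add-port=80/tcp --add-port=10161/tcp
# On each worker node
sudo firewall-cmd --permanent --add-port=10150-10151/tcp \
  --add-port=40000-40063/tcp --add-port=41000-41999/tcp
sudo firewall-cmd --reload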
Distributed vLLM with Ray Ports
When using distributed vLLM, GPUStack allocates ports for Ray services from the range above and assigns them in the following order (a concrete mapping is sketched after this list):
- GCS server port (the first port of the range)
- Client Server port
- Dashboard port
- Dashboard gRPC port (no longer used since Ray 2.45.0, kept for backward compatibility)
- Dashboard agent gRPC port
- Dashboard agent listen port
- Metrics export port
- Node Manager port
- Object Manager port
- Raylet runtime env agent port
- Minimum port number for the worker
- Maximum port number for the worker (the last port of the range)
For more details on Ray ports, see the Ray documentation.
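To make the ordering concrete, the sketch below maps it onto the default 41000-41999 range. This is only an illustration of the consecutive assignment, not GPUStack source code; the actual values follow your configured port range.
RANGE_START=41000
RANGE_END=41999
GCS_SERVER_PORT=$((RANGE_START + 0))              # GCS server port
CLIENT_SERVER_PORT=$((RANGE_START + 1))           # Client Server port
DASHBOARD_PORT=$((RANGE_START + 2))               # Dashboard port
DASHBOARD_GRPC_PORT=$((RANGE_START + 3))          # kept for backward compatibility
DASHBOARD_AGENT_GRPC_PORT=$((RANGE_START + 4))    # Dashboard agent gRPC port
DASHBOARD_AGENT_LISTEN_PORT=$((RANGE_START + 5))  # Dashboard agent listen port
METRICS_EXPORT_PORT=$((RANGE_START + 6))          # Metrics export port
NODE_MANAGER_PORT=$((RANGE_START + 7))            # Node Manager port
OBJECT_MANAGER_PORT=$((RANGE_START + 8))          # Object Manager port
RUNTIME_ENV_AGENT_PORT=$((RANGE_START + 9))       # Raylet runtime env agent port
MIN_WORKER_PORT=$((RANGE_START + 10))             # Minimum worker port
MAX_WORKER_PORT=$RANGE_END                        # Maximum worker port (end of range)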
Embedded Gateway Ports
The embedded gateway on both the server and the worker uses the following ports for internal communication. A quick check of these listeners is sketched after the table.
| Port | Host | Description |
|---|---|---|
| TCP 18443 | 127.0.0.1 | Port for the file-based APIServer serving via HTTPS |
| TCP 15000 | 127.0.0.1 | Management port for the Envoy gateway |
| TCP 15021 | 0.0.0.0 | Health check port for the Envoy gateway |
| TCP 15090 | 0.0.0.0 | Metrics port for the Envoy gateway |
| TCP 9876 | 127.0.0.1 | Introspection port for the Pilot-discovery |
| TCP 15010 | 127.0.0.1 | Port for Pilot-discovery serving XDS via HTTP/gRPC |
| TCP 15012 | 127.0.0.1 | Port for Pilot-discovery serving XDS via secure gRPC |
| TCP 15020 | 0.0.0.0 | Metrics port for Pilot-agent |
| TCP 8888 | 127.0.0.1 | Port for Controller serving XDS via HTTP |
| TCP 15051 | 127.0.0.1 | Port for Controller serving XDS via gRPC |
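To confirm the gateway components are listening as expected on a node, you can inspect the local listeners; a sketch assuming iproute2's ss is available:
sudo ss -ltnp | grep -E ':(15000|15012|15020|15021|15090|18443)\b'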