Installation Requirements
This page outlines the software and networking requirements for nodes running GPUStack.
Operating System Requirements
GPUStack supports most modern Linux distributions on AMD64 and ARM64 architectures.
Note
- Installing GPUStack directly from PyPI is not recommended. For best compatibility, use the provided Docker images.
- The Network Time Protocol (NTP) package must be installed to ensure consistent state synchronization between nodes.
GPUStack has been tested and verified on the following operating systems:
| OS | Versions |
|---|---|
| Ubuntu | >= 20.04 |
| Debian | >= 11 |
| RHEL | >= 8 |
| Rocky | >= 8 |
| Fedora | >= 36 |
| openSUSE | >= 15.3 (Leap) |
| openEuler | >= 22.03 |
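As a rough pre-flight check, the table above can be encoded and compared against the values found in `/etc/os-release`. This is a minimal sketch, not part of GPUStack: the helper names are hypothetical, and the dictionary keys are assumed `ID` values that may differ on your distribution.

```python
# Minimum versions copied from the table above. The keys are ASSUMED
# /etc/os-release ID values; adjust them if your distribution differs.
MIN_VERSIONS = {
    "ubuntu": (20, 4),
    "debian": (11,),
    "rhel": (8,),
    "rocky": (8,),
    "fedora": (36,),
    "opensuse-leap": (15, 3),
    "openEuler": (22, 3),
}


def parse_version(text):
    """Turn '22.04' into (22, 4); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def is_tested_os(os_id, version_id):
    """Return True if the OS/version pair falls within the tested range."""
    minimum = MIN_VERSIONS.get(os_id)
    if minimum is None:
        return False
    return parse_version(version_id) >= minimum


def read_os_release(path="/etc/os-release"):
    """Parse KEY=VALUE lines from an os-release file into a dict."""
    info = {}
    with open(path) as f:
        for line in f:
            key, sep, value = line.strip().partition("=")
            if sep:
                info[key] = value.strip('"')
    return info
```

On a node, `info = read_os_release()` followed by `is_tested_os(info.get("ID", ""), info.get("VERSION_ID", ""))` reports whether the host matches the table. A `False` result only means the combination is not listed here, not that GPUStack cannot run.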
Accelerator Runtime Requirements
GPUStack supports a variety of general-purpose accelerators as inference backends, including:
- NVIDIA GPU
- AMD GPU
- Ascend NPU
- Hygon DCU (Experimental)
- MThreads GPU (Experimental)
- Iluvatar GPU (Experimental)
- MetaX GPU (Experimental)
- Cambricon MLU (Experimental)
Ensure all required drivers and toolkits are installed before running GPUStack.
NVIDIA GPU
To use NVIDIA GPU, install:
AMD GPU
To use AMD GPU, install:
Ascend NPU
To use Ascend NPU, install:
Hygon DCU
To use Hygon DCU, install:
MThreads GPU
To use MThreads GPU, install:
Iluvatar GPU
To use Iluvatar GPU, install:
MetaX GPU
To use MetaX GPU, install:
Cambricon MLU
To use Cambricon MLU, install:
- Cambricon MLU Driver
- Cambricon NeuWare Toolkit
Networking Requirements
Connectivity Requirements
The following network connectivity is required for GPUStack to function properly:
- Server-to-Worker: The server must be able to reach workers to proxy inference requests.
- Worker-to-Server: Workers must be able to reach the server to register and send updates.
- Worker-to-Worker: Required for distributed inference across multiple workers.
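One way to verify each direction listed above is a plain TCP connection attempt from the source node to the target node. The following is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, not a GPUStack command.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, running `can_connect("<server-address>", 80)` on a worker exercises the Worker-to-Server path, and `can_connect("<worker-address>", 10150)` on the server exercises Server-to-Worker, assuming the default ports are in use. The check only succeeds if the target service is already listening.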
Port Requirements
GPUStack uses these ports for communication:
Server Ports
| Port | Description |
|---|---|
| TCP 80 | Default port for GPUStack UI and API endpoints |
| TCP 443 | Default port for GPUStack UI and API endpoints (TLS enabled) |
| TCP 10161 | Default port for server metrics endpoint |
| TCP 8080 | Default port for GPUStack server internal API |
| TCP 5432 | Default port for the embedded PostgreSQL database |
Worker Ports
| Port | Description |
|---|---|
| TCP 10150 | Default port for GPUStack worker |
| TCP 10151 | Default port for worker metrics endpoint |
| TCP 8080 | Default port for GPUStack worker internal API |
| TCP 40000-40063 | Port range for inference services |
| TCP 41000-41999 | Port range for Ray services (used by distributed vLLM deployments) |
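Before starting GPUStack, it can help to confirm that none of the default ports are already taken on the node. The sketch below tries to bind each port and reports the ones that fail. The port numbers come from the tables above, while the function names are hypothetical. Binding ports below 1024 usually requires elevated privileges, so without them ports 80 and 443 are reported as unavailable.

```python
import socket

# Default ports copied from the server and worker tables above.
SERVER_PORTS = [80, 443, 10161, 8080, 5432]
WORKER_PORTS = [10150, 10151, 8080]
INFERENCE_RANGE = range(40000, 40064)  # 40000-40063 inclusive
RAY_RANGE = range(41000, 42000)        # 41000-41999 inclusive


def is_port_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            # Address already in use, or insufficient permission to bind.
            return False


def ports_in_use(ports, host="0.0.0.0"):
    """Return the subset of ports that could not be bound."""
    return [p for p in ports if not is_port_free(p, host)]
```

On a worker, `ports_in_use(list(WORKER_PORTS) + list(INFERENCE_RANGE))` lists conflicts to resolve first. Run the check before GPUStack starts, since a running instance would itself occupy these ports.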
Embedded Gateway Ports
The embedded gateway for both server and worker uses the following ports for internal communications.
| Port | Host | Description |
|---|---|---|
| TCP 18443 | 127.0.0.1 | Port for the file-based APIServer serving via HTTPS |
| TCP 15000 | 127.0.0.1 | Management port for the Envoy gateway |
| TCP 15021 | 0.0.0.0 | Health check port for the Envoy gateway |
| TCP 15090 | 0.0.0.0 | Metrics port for the Envoy gateway |
| TCP 9876 | 127.0.0.1 | Introspection port for the Pilot-discovery |
| TCP 15010 | 127.0.0.1 | Port for Pilot-discovery serving XDS via HTTP/gRPC |
| TCP 15012 | 127.0.0.1 | Port for Pilot-discovery serving XDS via secure gRPC |
| TCP 15020 | 0.0.0.0 | Metrics port for Pilot-agent |
| TCP 8888 | 127.0.0.1 | Port for Controller serving XDS via HTTP |
| TCP 15051 | 127.0.0.1 | Port for Controller serving XDS via gRPC |
Distributed vLLM with Ray Ports
When using distributed vLLM, GPUStack takes the Ray services port range listed under Worker Ports and assigns ports from it in the following order:
- GCS server port (the first port of the range)
- Client Server port
- Dashboard port
- Dashboard gRPC port (no longer used since Ray 2.45.0, kept for backward compatibility)
- Dashboard agent gRPC port
- Dashboard agent listen port
- Metrics export port
- Node Manager port
- Object Manager port
- Raylet runtime env agent port
- Minimum port number for the worker
- Maximum port number for the worker (the last port of the range)
For more details on Ray ports, see the Ray documentation.
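The ordering above can be illustrated with a small mapping function. This is a sketch under stated assumptions, not GPUStack's actual implementation: the key names are illustrative labels, the fixed-purpose ports are assumed to be consecutive from the start of the range, and the worker port span is assumed to cover everything that remains.

```python
# Illustrative labels for the fixed-purpose ports, in the order listed above.
RAY_PORT_NAMES = [
    "gcs_server_port",
    "client_server_port",
    "dashboard_port",
    "dashboard_grpc_port",
    "dashboard_agent_grpc_port",
    "dashboard_agent_listen_port",
    "metrics_export_port",
    "node_manager_port",
    "object_manager_port",
    "runtime_env_agent_port",
]


def assign_ray_ports(start=41000, end=41999):
    """Map Ray services to ports from an inclusive range, in list order."""
    if end - start + 1 <= len(RAY_PORT_NAMES):
        raise ValueError("port range too small for Ray services")
    # ASSUMPTION: fixed-purpose ports are consecutive from the range start.
    ports = {name: start + i for i, name in enumerate(RAY_PORT_NAMES)}
    # ASSUMPTION: worker ports take everything left, up to the range end.
    ports["min_worker_port"] = start + len(RAY_PORT_NAMES)
    ports["max_worker_port"] = end
    return ports
```

With the default range, this sketch places the GCS server on 41000 and leaves 41010 through 41999 for Ray worker processes, which is useful when deciding which ports to open between workers.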