Installation Requirements
This page outlines the software and networking requirements for nodes running GPUStack.
Operating System Requirements
GPUStack supports most modern Linux distributions on AMD64 and ARM64 architectures.
Note
- Installing GPUStack directly from PyPI is not recommended. For best compatibility, use the provided Docker images (an illustrative run command follows this note).
- A Network Time Protocol (NTP) package must be installed to keep node clocks in sync, which is required for consistent state synchronization between nodes.
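For reference, a Docker-based install on an NVIDIA host often looks roughly like the sketch below. The image, flags, and data path here are illustrative assumptions; consult the installation guide for the exact command for your accelerator and GPUStack version.
sudo docker run -d --name gpustack \
  --restart=unless-stopped \
  --gpus all \
  --network=host \
  --ipc=host \
  -v gpustack-data:/var/lib/gpustack \
  gpustack/gpustack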
Accelerator Runtime Requirements
GPUStack supports a variety of general-purpose accelerators for running inference, including:
- NVIDIA GPU
- AMD GPU
- Ascend NPU
- Hygon DCU
- MThreads GPU (Experimental)
- Iluvatar GPU (Experimental)
- MetaX GPU (Experimental)
- Cambricon MLU (Experimental)
Ensure all required drivers and toolkits are installed before running GPUStack.
NVIDIA GPU
Requirements
- NVIDIA GPU Driver that supports NVIDIA CUDA 12.4 or higher.
- NVIDIA Container Toolkit
Run the following commands to verify:
sudo nvidia-smi
# If using Docker
sudo docker info 2>/dev/null | grep -q "nvidia" \
&& echo "NVIDIA Container Toolkit OK" \
|| (echo "NVIDIA Container Toolkit not configured"; exit 1)
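If both checks pass, you can optionally confirm end-to-end GPU access from a container; a minimal sketch, assuming the public nvidia/cuda base image is pullable:
# Should print the same GPU table as the host nvidia-smi
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi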
Supported Inference Backends
AMD GPU
Requirements
- AMD GPU Driver that supports AMD ROCm 6.4 or higher.
- AMD Container Runtime
Run the following commands to verify:
sudo amd-smi static
# If using Docker
sudo docker info 2>/dev/null | grep -q "amd" \
&& echo "AMD Container Toolkit OK" \
|| (echo "AMD Container Toolkit not configured"; exit 1)
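Optionally, confirm GPU visibility from inside a container as well; a minimal sketch, assuming the public rocm/rocm-terminal image and the standard /dev/kfd and /dev/dri device nodes:
# Should list the same GPUs as the host amd-smi
sudo docker run --rm --device=/dev/kfd --device=/dev/dri \
  --security-opt seccomp=unconfined rocm/rocm-terminal rocm-smi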
Supported Inference Backends
- vLLM
- Custom
Ascend NPU
Requirements
Run the following commands to verify:
sudo npu-smi info
# If using Docker
sudo docker info 2>/dev/null | grep -q "ascend" \
&& echo "Ascend Container Toolkit OK" \
|| (echo "Ascend Container Toolkit not configured"; exit 1)
Supported Devices
- Ascend NPU 910C series
- Ascend NPU 910B series (910B1 ~ 910B4)
- Ascend NPU 310P3
Supported Inference Backends
Hygon DCU
Requirements
Run the following commands to verify:
sudo hy-smi
Supported Devices
- Hygon DCU: K100_AI (verified); Z100, Z100L, K100 (not verified)
Supported Inference Backends
- vLLM
- Custom
MThreads GPU
Requirements
Run the following commands to verify:
sudo mthreads-gmi
# If using Docker
sudo docker info 2>/dev/null | grep -q "mthreads" \
&& echo "MThreads Container Toolkit OK" \
|| (echo "MThreads Container Toolkit not configured"; exit 1)
Supported Inference Backends
- Custom
Iluvatar GPU
Requirements
Run the following commands to verify:
sudo ixsmi
# If using Docker
sudo docker info 2>/dev/null | grep -q "iluvatar" \
&& echo "Iluvatar Container Toolkit OK" \
|| (echo "Iluvatar Container Toolkit not configured"; exit 1)
Supported Inference Backends
- vLLM
- Custom
MetaX GPU
Requirements
Run the following commands to verify:
sudo mx-smi
Supported Inference Backends
- Custom
Cambricon MLU
Requirements
- Cambricon MLU Driver
- Cambricon NeuWare Toolkit
Run the following commands to verify:
sudo cnmon
Supported Inference Backends
- Custom
Networking Requirements
Connectivity Requirements
The following network connectivity is required for GPUStack to function properly (a quick reachability check is sketched after this list):
- Server-to-Worker: The server must be able to reach workers to proxy inference requests.
- Worker-to-Server: Workers must be able to reach the server to register themselves and send updates.
- Worker-to-Worker: Workers must be able to reach each other for distributed inference across multiple workers.
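Before installing, you can sanity-check these paths with a plain TCP reachability test; a sketch assuming the default ports listed below and placeholder addresses:
# From a worker node: can it reach the server's UI/API port?
nc -zv <server-ip> 80
# From the server (or another worker): can it reach a worker?
nc -zv <worker-ip> 10150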
Port Requirements
GPUStack uses the following ports for communication (example firewall rules for opening them are sketched after the worker port table):
Server Ports
| Port | Description |
|---|---|
| TCP 80 | Default port for GPUStack UI and API endpoints |
| TCP 443 | Default port for GPUStack UI and API endpoints (TLS enabled) |
| TCP 10161 | Default port for server metrics endpoint |
| TCP 8080 | Default port for GPUStack server internal API |
| TCP 5432 | Default port for the embedded PostgreSQL database |
Worker Ports
| Port | Description |
|---|---|
| TCP 10150 | Default port for GPUStack worker |
| TCP 10151 | Default port for worker metrics endpoint |
| TCP 8080 | Default port for GPUStack worker internal API |
| TCP 40000-40063 | Port range for inference services |
| TCP 41000-41999 | Port range for Ray services (used by distributed vLLM deployments) |
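If a host firewall is enabled, these ports must be opened. A sketch assuming firewalld; adapt the tool and the port list to your deployment and the ports you actually use:
# On the server node
sudo firewall-cmd --permanent --add-port=80/tcp --add-port=10161/tcp
# On each worker node
sudo firewall-cmd --permanent --add-port=10150-10151/tcp \
  --add-port=40000-40063/tcp --add-port=41000-41999/tcp
sudo firewall-cmd --reload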
Distributed vLLM with Ray Ports
When using distributed vLLM, GPUStack allocates ports for Ray services from the range above and assigns them in the following order (a concrete mapping is sketched after this list):
- GCS server port (the first port of the range)
- Client Server port
- Dashboard port
- Dashboard gRPC port (no longer used since Ray 2.45.0, kept for backward compatibility)
- Dashboard agent gRPC port
- Dashboard agent listen port
- Metrics export port
- Node Manager port
- Object Manager port
- Raylet runtime env agent port
- Minimum port number for the worker
- Maximum port number for the worker (the last port of the range)
For more details on Ray ports, see the Ray documentation.
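To make the ordering concrete, the sketch below maps it onto the default 41000-41999 range. This is only an illustration of the consecutive assignment, not GPUStack source code; the actual values follow your configured port range.
RANGE_START=41000
RANGE_END=41999
GCS_SERVER_PORT=$((RANGE_START + 0))              # GCS server port
CLIENT_SERVER_PORT=$((RANGE_START + 1))           # Client Server port
DASHBOARD_PORT=$((RANGE_START + 2))               # Dashboard port
DASHBOARD_GRPC_PORT=$((RANGE_START + 3))          # kept for backward compatibility
DASHBOARD_AGENT_GRPC_PORT=$((RANGE_START + 4))    # Dashboard agent gRPC port
DASHBOARD_AGENT_LISTEN_PORT=$((RANGE_START + 5))  # Dashboard agent listen port
METRICS_EXPORT_PORT=$((RANGE_START + 6))          # Metrics export port
NODE_MANAGER_PORT=$((RANGE_START + 7))            # Node Manager port
OBJECT_MANAGER_PORT=$((RANGE_START + 8))          # Object Manager port
RUNTIME_ENV_AGENT_PORT=$((RANGE_START + 9))       # Raylet runtime env agent port
MIN_WORKER_PORT=$((RANGE_START + 10))             # Minimum worker port
MAX_WORKER_PORT=$RANGE_END                        # Maximum worker port (end of range)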
Embedded Gateway Ports
The embedded gateway on both the server and the worker uses the following ports for internal communication. A quick check of these listeners is sketched after the table.
| Port | Host | Description |
|---|---|---|
| TCP 18443 | 127.0.0.1 | Port for the file-based APIServer serving via HTTPS |
| TCP 15000 | 127.0.0.1 | Management port for the Envoy gateway |
| TCP 15021 | 0.0.0.0 | Health check port for the Envoy gateway |
| TCP 15090 | 0.0.0.0 | Metrics port for the Envoy gateway |
| TCP 9876 | 127.0.0.1 | Introspection port for the Pilot-discovery |
| TCP 15010 | 127.0.0.1 | Port for Pilot-discovery serving XDS via HTTP/gRPC |
| TCP 15012 | 127.0.0.1 | Port for Pilot-discovery serving XDS via secure gRPC |
| TCP 15020 | 0.0.0.0 | Metrics port for Pilot-agent |
| TCP 8888 | 127.0.0.1 | Port for Controller serving XDS via HTTP |
| TCP 15051 | 127.0.0.1 | Port for Controller serving XDS via gRPC |
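To confirm the gateway components are listening as expected on a node, you can inspect the local listeners; a sketch assuming iproute2's ss is available:
sudo ss -ltnp | grep -E ':(15000|15012|15020|15021|15090|18443)\b'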