Installation Requirements
This page describes the software and networking requirements for the nodes where GPUStack will be installed.
Python Requirements
GPUStack requires Python version 3.10 to 3.12.
Operating System Requirements
GPUStack is supported on the following operating systems:
- macOS
- Windows
- Linux
GPUStack has been tested and verified to work on the following operating systems:
OS | Versions |
---|---|
Windows | 10, 11 |
macOS | >= 14 |
Ubuntu | >= 20.04 |
Debian | >= 11 |
RHEL | >= 8 |
Rocky | >= 8 |
Fedora | >= 36 |
OpenSUSE | >= 15.3 (leap) |
OpenEuler | >= 22.03 |
Note
The installation of GPUStack worker on a Linux system requires that the GLIBC version be 2.29 or higher. If your system uses a lower GLIBC version, consider using the Docker Installation
method as an alternative.
Use the following command to check the GLIBC version:
ldd --version
Supported Architectures
GPUStack supports both AMD64 and ARM64 architectures, with the following notes:
- On Linux and macOS, when using Python versions below 3.12, ensure that the installed Python distribution corresponds to your system architecture.
- On Windows, please use the AMD64 distribution of Python, as wheel packages for certain dependencies are unavailable for ARM64. If you use tools like
conda
, this will be handled automatically, as conda installs the AMD64 distribution by default.
Accelerator Runtime Requirements
GPUStack supports the following accelerators:
- NVIDIA CUDA (Compute Capability 6.0 and above)
- Apple Metal (M-series chips)
- AMD ROCm
- Ascend CANN
- Hygon DTK
- Moore Threads MUSA
Ensure all necessary drivers and libraries are installed on the system prior to installing GPUStack.
NVIDIA CUDA
To use NVIDIA CUDA as an accelerator, ensure the following components are installed:
- NVIDIA Driver
- NVIDIA CUDA Toolkit 12 (Optional, required for non-Docker installations)
- NVIDIA cuDNN 9 (Optional, required for audio models when not using Docker)
- NVIDIA Container Toolkit (Optional, required for Docker installation)
AMD ROCm
To use AMD ROCm as an accelerator, ensure the following components are installed:
Ascend CANN
For Ascend CANN as an accelerator, ensure the following components are installed:
- Ascend NPU Driver & Firmware
- Ascend CANN Toolkit & Kernels (Optional, required for non-Docker installations)
Hygon DTK
To use Hygon DTK as an accelerator, ensure the following components are installed:
Moore Threads MUSA
To use Moore Threads MUSA as an accelerator, ensure the following components are installed:
- MUSA SDK
- MT Container Toolkits (Optional, required for docker installation)
Networking Requirements
Connectivity Requirements
The following network connectivity is required to ensure GPUStack functions properly:
Server-to-Worker: The server must be able to reach the workers for proxying inference requests.
Worker-to-Server: Workers must be able to reach the server to register themselves and send updates.
Worker-to-Worker: Necessary for distributed inference across multiple workers
Port Requirements
GPUStack uses the following ports for communication:
Server Ports
Port | Description |
---|---|
TCP 80 | Default port for the GPUStack UI and API endpoints |
TCP 443 | Default port for the GPUStack UI and API endpoints (when TLS is enabled) |
The following ports are used on GPUStack server when Ray is enabled for distributed vLLM across workers:
Ray Port | Description |
---|---|
TCP 8265 | Default Port for Ray dashboard |
TCP 40096 | Default port for Ray (GCS server) |
TCP 40097 | Default port for Ray Client Server |
The default ports in GPUStack may differ from Ray’s default ports to simplify port exposure, especially when using Docker. For more information about Ray ports, refer to the Ray documentation.
Worker Ports
Port | Description |
---|---|
TCP 10150 | Default port for the GPUStack worker |
TCP 10151 | Default port for exposing metrics |
TCP 40000-40063 | Port range allocated for inference services |
TCP 40064-40095 | Port range allocated for llama-box RPC servers |
The following ports are used on GPUStack worker when Ray is enabled for distributed vLLM across workers:
Ray Port | Description |
---|---|
TCP 40098 | Default port for Ray node manager |
TCP 40099 | Default port for Ray object manager |
TCP 40100-40131 | Port range for Ray worker processes |