Air-Gapped Installation
You can install GPUStack in an air-gapped environment, that is, an environment without internet access where GPUStack and its dependencies must be installed offline.
The following methods are available for installing GPUStack in an air-gapped environment:
OS | Arch | Supported methods |
---|---|---|
Linux | AMD64, ARM64 | Docker Installation (Recommended), pip Installation |
Windows | AMD64 | pip Installation |
Supported backends
- vLLM (requires Compute Capability 7.0 or above; Linux AMD64 only)
- llama-box
- vox-box
Prerequisites
- Port Requirements
- CPU support for llama-box backend: AMD64 with AVX2, or ARM64 with NEON
Check if the CPU is supported on Linux.
On AMD64:
lscpu | grep avx2
On ARM64:
grep -E -i "neon|asimd" /proc/cpuinfo
Windows users need to manually verify support for the above instructions.
Check if the NVIDIA driver is installed:
nvidia-smi --format=csv,noheader --query-gpu=index,name,memory.total,memory.used,utilization.gpu,temperature.gpu
Then ensure the driver supports CUDA 12.4 or higher.
On Linux:
nvidia-smi | grep "CUDA Version"
On Windows:
nvidia-smi | findstr "CUDA Version"
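The checks above can be folded into small helper functions so that the result is a pass/fail status rather than output to read by eye. The following is a minimal sketch for Linux; the function names are hypothetical, and it assumes the `nvidia-smi` header contains a `CUDA Version: X.Y` field, as the grep above implies.

```shell
# Print the CUDA version found in nvidia-smi header text read from stdin.
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 (major.minor) is at least version $2.
version_at_least() (
    have=$1
    need=$2
    case $have in *.*) ;; *) have="$have.0" ;; esac
    case $need in *.*) ;; *) need="$need.0" ;; esac
    have_major=${have%%.*}; have_rest=${have#*.}; have_minor=${have_rest%%.*}
    need_major=${need%%.*}; need_rest=${need#*.}; need_minor=${need_rest%%.*}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
)

# Succeed if the cpuinfo-style text on stdin lists CPU flag $1 as a whole word.
cpu_has_flag() {
    grep -q -i -w -- "$1"
}

# Usage on a host with the NVIDIA driver installed:
#   version_at_least "$(nvidia-smi | cuda_version_from_smi)" 12.4 && echo "driver OK"
#   cpu_has_flag avx2 < /proc/cpuinfo && echo "AVX2 available"
```

Keeping the parsing separate from the comparison makes it possible to try the logic on captured output from another machine before relying on it.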
Docker Installation
Prerequisites
Check if Docker and NVIDIA Container Toolkit are installed:
docker info | grep Runtimes | grep nvidia
Note
When systemd manages the cgroups of the container and is triggered to reload any unit files that reference NVIDIA GPUs (e.g., by systemctl daemon-reload), containerized GPU workloads may suddenly lose access to their GPUs.
In GPUStack, GPUs may disappear from the Resources menu, and running nvidia-smi inside the GPUStack container may fail with the error: Failed to initialize NVML: Unknown Error
To prevent this issue, disable systemd cgroup management in Docker. Set the parameter "exec-opts": ["native.cgroupdriver=cgroupfs"] in the /etc/docker/daemon.json file and restart Docker, for example:
vim /etc/docker/daemon.json
{
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  },
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
systemctl daemon-reload && systemctl restart docker
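To confirm that the setting is in place, a quick check of the configuration file and of the running daemon can help. This is a minimal sketch: the function name is hypothetical, and the text match is deliberately simple (it does not parse JSON).

```shell
# Succeed if the given daemon.json (default: /etc/docker/daemon.json)
# contains the cgroupfs exec-opt. This is a plain text match, not a JSON parse.
daemon_json_uses_cgroupfs() {
    grep -q '"native.cgroupdriver=cgroupfs"' "${1:-/etc/docker/daemon.json}"
}

# Usage:
#   daemon_json_uses_cgroupfs && echo "daemon.json is configured"
#
# After restarting Docker, the active driver can be queried with:
#   docker info --format '{{.CgroupDriver}}'
# which should report cgroupfs once the change has taken effect.
```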
Run GPUStack
When running GPUStack with Docker, it works out of the box in an air-gapped environment as long as the Docker images are available. To do this, follow these steps:
- Pull the GPUStack Docker image in an online environment:
docker pull gpustack/gpustack
If your online environment differs from the air-gapped environment in terms of OS or arch, specify the OS and arch of the air-gapped environment when pulling the image:
docker pull --platform linux/amd64 gpustack/gpustack
- Publish the Docker image to a private registry, or load it directly in the air-gapped environment.
- Refer to the Docker Installation guide to run GPUStack using Docker.
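The second step above can be done in two ways, sketched below with standard Docker commands. The archive name, the function names, and the registry host used in the usage comments are placeholders, not values defined by GPUStack.

```shell
IMAGE="gpustack/gpustack"
ARCHIVE="gpustack-image.tar"   # hypothetical file name

# Build the image reference as it would appear in a private registry.
private_ref() {
    printf '%s/%s\n' "$1" "$2"
}

# Option A: export the image to a file online, then import it offline.
export_image() { docker save -o "$ARCHIVE" "$IMAGE"; }
import_image() { docker load -i "$ARCHIVE"; }

# Option B: retag the image and push it to a private registry that the
# air-gapped hosts can reach. $1 is the registry host, optionally with a port.
push_to_registry() (
    ref=$(private_ref "$1" "$IMAGE")
    docker tag "$IMAGE" "$ref" && docker push "$ref"
)

# Usage:
#   export_image                                    # online host
#   import_image                                    # air-gapped host, after copying the archive
#   push_to_registry registry.example.internal:5000 # placeholder registry host
```

With option B, the image name passed to docker run in the air-gapped environment needs to include the registry prefix.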
pip Installation
Prerequisites
- Python 3.10 ~ 3.12
Check the Python version:
python -V
Check if CUDA is installed and verify that its version is at least 12.4:
nvcc -V
- NVIDIA cuDNN 9 (Optional, required for audio models)
Check if cuDNN 9 is installed.
On Linux:
ldconfig -p | grep libcudnn
On Windows (PowerShell):
Get-ChildItem -Path C:\ -Recurse -Filter "cudnn*.dll" -ErrorAction SilentlyContinue
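The Python version requirement can also be checked as a pass/fail test. This is a minimal sketch with a hypothetical function name; it treats the range 3.10 ~ 3.12 as inclusive at both ends.

```shell
# Succeed if a version string such as "Python 3.11.4" or "3.11.4"
# falls within 3.10 to 3.12 (inclusive).
python_version_supported() (
    ver=${1#Python }      # drop the "Python " prefix printed by `python -V`
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 12 ]
)

# Usage:
#   python_version_supported "$(python -V 2>&1)" && echo "Python version OK"
```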
Install GPUStack
For a manual pip installation, you need to prepare the required packages and tools in an online environment and then transfer them to the air-gapped environment.
Set up an online environment that matches the air-gapped environment in OS, architecture, and Python version.
Step 1: Download the Required Packages
Run the following commands in an online environment.
On Linux, to include all extras (the "vllm" extra requires Linux AMD64):
# Extra dependencies options are "vllm", "audio" and "all"
# "vllm" is only available for Linux AMD64
PACKAGE_SPEC="gpustack[all]"
# To install a specific version
# PACKAGE_SPEC="gpustack[all]==0.6.0"
On Linux, to include only the "audio" extra (for example, where "vllm" is not available):
PACKAGE_SPEC="gpustack[audio]"
# To install a specific version
# PACKAGE_SPEC="gpustack[audio]==0.6.0"
If you don't need the vLLM backend or support for audio models, just set:
PACKAGE_SPEC="gpustack"
On Windows (PowerShell):
$PACKAGE_SPEC = "gpustack[audio]"
# To install a specific version
# $PACKAGE_SPEC = "gpustack[audio]==0.6.0"
If you don't need support for audio models, just set:
$PACKAGE_SPEC = "gpustack"
Download all required packages:
pip wheel $PACKAGE_SPEC -w gpustack_offline_packages
Install GPUStack to use its CLI:
pip install gpustack
Download dependency tools and save them as an archive:
gpustack download-tools --save-archive gpustack_offline_tools.tar.gz
If your online environment differs from the air-gapped environment, specify the OS, architecture, and device explicitly:
gpustack download-tools --save-archive gpustack_offline_tools.tar.gz --system linux --arch amd64 --device cuda
Note
This instruction assumes that the online environment uses the same GPU type as the air-gapped environment. If the GPU types differ, use the --device flag to specify the device type for the air-gapped environment. Refer to the download-tools command for more information.
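When the two environments differ, it can be convenient to collect the target values on the air-gapped host and carry them back to the online host. The helper below is a sketch with a hypothetical name. The value amd64 appears in the command above, while arm64 is an assumed spelling based on the supported-architecture table, so confirm it against the download-tools reference.

```shell
# Map `uname -m` output to an --arch value.
# "amd64" is taken from the example command; "arm64" is an assumption.
arch_flag() {
    case $1 in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Usage on the air-gapped host; note the printed values for later use online:
#   echo "--system $(uname -s | tr '[:upper:]' '[:lower:]') --arch $(arch_flag "$(uname -m)")"
```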
Step 2: Transfer the Packages
Transfer the following files from the online environment to the air-gapped environment:
- The gpustack_offline_packages directory.
- The gpustack_offline_tools.tar.gz file.
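Because the transfer usually goes through removable media or an intermediate host, bundling both items with a checksum makes it easy to detect a truncated or corrupted copy. This is a minimal sketch for Linux, assuming GNU sha256sum and tar are available; the bundle file name and function names are hypothetical.

```shell
# Online environment: bundle both artifacts and record a checksum.
# $1 is the directory that contains gpustack_offline_packages and
# gpustack_offline_tools.tar.gz.
make_bundle() (
    cd "$1" || exit 1
    tar -czf gpustack_offline_bundle.tar.gz \
        gpustack_offline_packages gpustack_offline_tools.tar.gz &&
        sha256sum gpustack_offline_bundle.tar.gz > gpustack_offline_bundle.tar.gz.sha256
)

# Air-gapped environment: verify the checksum, then unpack.
# $1 is the directory that holds the bundle and its .sha256 file.
unpack_bundle() (
    cd "$1" || exit 1
    sha256sum -c gpustack_offline_bundle.tar.gz.sha256 &&
        tar -xzf gpustack_offline_bundle.tar.gz
)

# Usage:
#   make_bundle .      # online host
#   unpack_bundle .    # air-gapped host, after copying both bundle files
```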
Step 3: Install GPUStack
In the air-gapped environment, run the following commands.
Install GPUStack from the downloaded packages:
pip install --no-index --find-links=gpustack_offline_packages gpustack
Load and apply the pre-downloaded tools archive:
gpustack download-tools --load-archive gpustack_offline_tools.tar.gz
Now you can run GPUStack by following the instructions in the pip Installation guide.
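If the offline install in Step 3 fails, a common cause is an incomplete transfer. The sketch below, with a hypothetical function name, checks that the wheel directory and the tools archive are present before the install commands are run.

```shell
# Succeed if the wheel directory holds at least one .whl file and the
# tools archive exists. Defaults match the file names used in this guide.
offline_artifacts_present() (
    pkg_dir=${1:-gpustack_offline_packages}
    tools=${2:-gpustack_offline_tools.tar.gz}
    [ -d "$pkg_dir" ] || { echo "missing directory: $pkg_dir" >&2; exit 1; }
    ls "$pkg_dir"/*.whl >/dev/null 2>&1 || { echo "no wheels in: $pkg_dir" >&2; exit 1; }
    [ -f "$tools" ] || { echo "missing file: $tools" >&2; exit 1; }
)

# Usage:
#   offline_artifacts_present && echo "ready to install"
#
# After installing, `pip show gpustack` reports the installed version.
```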