Air-Gapped Installation

You can install GPUStack in an air-gapped environment. An air-gapped environment is one without internet access, so GPUStack and its dependencies must be installed offline.

The following methods are available for installing GPUStack in an air-gapped environment:

OS	Arch	Supported methods
Linux	AMD64, ARM64	Docker Installation (Recommended), pip Installation
Windows	AMD64	Desktop Installer (Recommended), pip Installation

Supported backends

vLLM (Compute Capability 7.0 and above, only supports Linux AMD64)
llama-box
vox-box

Prerequisites

Port Requirements
CPU support for llama-box backend: AMD64 with AVX, or ARM64 with NEON

Linux

Check if the CPU is supported:

AMD64:

lscpu | grep avx

ARM64:

grep -E -i "neon|asimd" /proc/cpuinfo

Windows

Windows users need to manually verify that the CPU supports the instruction sets above.
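The two Linux checks can be combined into one helper that picks the instruction set matching the machine architecture. This is a minimal sketch and not part of GPUStack: the function name has_required_simd is hypothetical, and it assumes the flag names avx, neon, and asimd appear as whole words in the CPU flag text.

```shell
# has_required_simd ARCH FLAGS
#   ARCH  : machine architecture, e.g. the output of `uname -m`
#   FLAGS : CPU flag text, e.g. the contents of /proc/cpuinfo
# Returns 0 if the instruction set needed by llama-box is listed,
# 1 if it is missing, and 2 for an architecture this sketch does not know.
has_required_simd() {
  simd_arch=$1
  simd_flags=$2
  case "$simd_arch" in
    x86_64|amd64)
      printf '%s\n' "$simd_flags" | grep -qiw 'avx' ;;
    aarch64|arm64)
      printf '%s\n' "$simd_flags" | grep -qiwE 'neon|asimd' ;;
    *)
      return 2 ;;
  esac
}

# Example call on the local machine:
#   has_required_simd "$(uname -m)" "$(cat /proc/cpuinfo)" && echo "CPU supported"
```

Taking the flag text as an argument keeps the check independent of where the flags come from, so the same helper works with lscpu output or /proc/cpuinfo.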

NVIDIA Driver

Check if the NVIDIA driver is installed:

nvidia-smi --format=csv,noheader --query-gpu=index,name,memory.total,memory.used,utilization.gpu,temperature.gpu

Ensure that the driver supports CUDA 12.4 or higher:

Linux:

nvidia-smi | grep "CUDA Version"

Windows:

nvidia-smi | findstr /C:"CUDA Version"
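On Linux, the comparison against 12.4 can be scripted rather than read by eye. The sketch below is illustrative only: the helper names parse_cuda_version and version_at_least are hypothetical, and it assumes the nvidia-smi banner contains the text "CUDA Version: <major>.<minor>".

```shell
# parse_cuda_version: read nvidia-smi output on stdin and print "major.minor".
parse_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# version_at_least HAVE WANT: succeed if HAVE >= WANT.
# Both arguments are "major.minor" and are compared numerically.
version_at_least() {
  have_major=${1%%.*}; have_minor=${1#*.}
  want_major=${2%%.*}; want_minor=${2#*.}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Example call:
#   version_at_least "$(nvidia-smi | parse_cuda_version)" 12.4 && echo "driver OK"
```

Comparing the two components as integers avoids the string-comparison trap where "12.10" would sort before "12.4".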

Docker Installation

Prerequisites

Check if Docker and NVIDIA Container Toolkit are installed:

docker info | grep Runtimes | grep nvidia

Disabling Systemd Cgroup Management in Docker

Note

When systemd is used to manage the cgroups of the container and it is triggered to reload any Unit files that have references to NVIDIA GPUs (e.g. systemctl daemon-reload), containerized GPU workloads may suddenly lose access to their GPUs.

In GPUStack, GPUs may disappear from the Resources menu, and running nvidia-smi inside the GPUStack container may fail with the error: Failed to initialize NVML: Unknown Error

To prevent this issue, disable systemd cgroup management in Docker.

Set the parameter "exec-opts": ["native.cgroupdriver=cgroupfs"] in the /etc/docker/daemon.json file and restart Docker, for example:

vim /etc/docker/daemon.json

{
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  },
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}

systemctl daemon-reload && systemctl restart docker
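Before restarting Docker, it can help to confirm that the edited file actually contains the setting. The helper below is a rough sketch with a hypothetical name (daemon_json_uses_cgroupfs); it does a plain text match and does not validate the JSON syntax.

```shell
# daemon_json_uses_cgroupfs FILE
# Succeeds if FILE contains the cgroupfs exec-opt as a quoted string.
# This is a fixed-string search, not a JSON parse.
daemon_json_uses_cgroupfs() {
  grep -qF '"native.cgroupdriver=cgroupfs"' "$1"
}

# Example call:
#   daemon_json_uses_cgroupfs /etc/docker/daemon.json && echo "setting present"
#
# After the restart, the active driver can be read back from Docker itself:
#   docker info --format '{{.CgroupDriver}}'
```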

Run GPUStack

When running GPUStack with Docker, it works out of the box in an air-gapped environment as long as the Docker images are available. To do this, follow these steps:

Pull the GPUStack Docker image in an online environment:

docker pull gpustack/gpustack

If you’re using the Blackwell series or the GeForce RTX 50 series, or if your NVIDIA driver supports CUDA 12.8 (you can verify this with nvidia-smi | grep "CUDA Version"), we strongly recommend using the latest-cuda12.8 image:

docker pull gpustack/gpustack:latest-cuda12.8

If your online environment differs from the air-gapped environment in terms of OS or arch, specify the OS and arch of the air-gapped environment when pulling the image:

docker pull --platform linux/amd64 gpustack/gpustack

Publish the Docker image to a private registry, or load it directly in the air-gapped environment.
Refer to the Docker Installation guide to run GPUStack using Docker.
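The publish-or-load step can be sketched with standard Docker commands (docker save, docker load, docker tag, docker push). The function names, the archive file name, and the registry host registry.example.internal:5000 are placeholders chosen for illustration, not values from GPUStack.

```shell
# export_image IMAGE ARCHIVE
# Online host: write IMAGE to a tar archive that can be carried across.
export_image() {
  docker save -o "$2" "$1"
}

# import_image ARCHIVE
# Air-gapped host: load the archive into the local Docker image store.
import_image() {
  docker load -i "$1"
}

# push_to_registry IMAGE REGISTRY
# Alternative: retag IMAGE for a private registry and push it there.
push_to_registry() {
  docker tag "$1" "$2/$1" && docker push "$2/$1"
}

# Example (file name and registry host are placeholders):
#   export_image gpustack/gpustack gpustack-image.tar
#   import_image gpustack-image.tar
#   push_to_registry gpustack/gpustack registry.example.internal:5000
```

Loading from an archive needs no extra infrastructure, while a private registry is more convenient when several air-gapped hosts need the same image.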

pip Installation

Prerequisites

Python 3.10 ~ 3.12

Check the Python version:

python -V
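The supported range can also be checked in a script. This is a small sketch with a hypothetical helper name (python_version_supported); it assumes python -V prints a line of the form "Python X.Y.Z".

```shell
# python_version_supported VERSION
# Succeeds if VERSION (e.g. "3.11.4") falls in the 3.10 to 3.12 range.
python_version_supported() {
  case "$1" in
    3.10|3.10.*|3.11|3.11.*|3.12|3.12.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call:
#   python_version_supported "$(python -V 2>&1 | awk '{print $2}')" && echo "Python OK"
```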

NVIDIA CUDA Toolkit 12

Check if CUDA is installed and verify that its version is at least 12.4:

nvcc -V

NVIDIA cuDNN 9 (Optional, required for audio models)

Check if cuDNN 9 is installed:

Linux:

ldconfig -p | grep libcudnn

Windows:

Get-ChildItem -Path C:\ -Recurse -Filter "cudnn*.dll" -ErrorAction SilentlyContinue

Install GPUStack

For manual pip installation, you need to prepare the required packages and tools in an online environment and then transfer them to the air-gapped environment.

Set up an online environment identical to the air-gapped environment, including OS, architecture, and Python version.

Step 1: Download the Required Packages

Linux

Run the following commands in an online environment:

AMD64:

# Extra dependency options are "vllm", "audio" and "all"
# "vllm" is only available for Linux AMD64
PACKAGE_SPEC="gpustack[all]"
# To install a specific version
# PACKAGE_SPEC="gpustack[all]==0.6.0"

ARM64:

PACKAGE_SPEC="gpustack[audio]"
# To install a specific version
# PACKAGE_SPEC="gpustack[audio]==0.6.0"

If you don’t need the vLLM backend or support for audio models, just set:

PACKAGE_SPEC="gpustack"

Windows

Run the following commands in an online environment:

$PACKAGE_SPEC = "gpustack[audio]"
# To install a specific version
# $PACKAGE_SPEC = "gpustack[audio]==0.6.0"

If you don’t need support for audio models, just set:

$PACKAGE_SPEC = "gpustack"

Download all required packages:

pip wheel "$PACKAGE_SPEC" -w gpustack_offline_packages

Install GPUStack to use its CLI:

pip install gpustack

Download dependency tools and save them as an archive:

gpustack download-tools --save-archive gpustack_offline_tools.tar.gz

If your online environment differs from the air-gapped environment, specify the OS, architecture, and device explicitly:

gpustack download-tools --save-archive gpustack_offline_tools.tar.gz --system linux --arch amd64 --device cuda

Note

This instruction assumes that the online environment uses the same GPU type as the air-gapped environment. If the GPU types differ, use the --device flag to specify the device type for the air-gapped environment. Refer to the download-tools command for more information.

Step 2: Transfer the Packages

Transfer the following files from the online environment to the air-gapped environment.

gpustack_offline_packages directory.
gpustack_offline_tools.tar.gz file.
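One way to move both artifacts together and detect a corrupted copy is to pack them into a single archive with a checksum. This is only a sketch: the helper names and the bundle file name are hypothetical, and it assumes GNU tar and sha256sum are available on both machines.

```shell
# make_bundle OUT
# Pack both transfer artifacts into OUT and record its SHA-256 in OUT.sha256.
# Run it in the directory that contains the two artifacts.
make_bundle() {
  tar -czf "$1" gpustack_offline_packages gpustack_offline_tools.tar.gz &&
    sha256sum "$1" > "$1.sha256"
}

# verify_bundle OUT
# After the transfer, re-check OUT against OUT.sha256 (both in the current directory).
verify_bundle() {
  sha256sum -c "$1.sha256"
}

# Example (bundle name is a placeholder):
#   make_bundle gpustack_offline_bundle.tar.gz       # online environment
#   verify_bundle gpustack_offline_bundle.tar.gz     # air-gapped environment
#   tar -xzf gpustack_offline_bundle.tar.gz          # unpack both artifacts
```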

Step 3: Install GPUStack

In the air-gapped environment, run the following commands.

Install GPUStack from the downloaded packages:

pip install --no-index --find-links=gpustack_offline_packages gpustack
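If the install fails because pip cannot find gpustack, a quick look at the transferred directory can rule out an incomplete copy. The helper below is a hypothetical sketch; it assumes the wheel produced by pip wheel is named gpustack-<version>-....whl.

```shell
# has_gpustack_wheel DIR
# Succeeds if DIR holds at least one file matching gpustack-*.whl.
has_gpustack_wheel() {
  for wheel in "$1"/gpustack-*.whl; do
    # When nothing matches, the pattern stays unexpanded and -e fails.
    [ -e "$wheel" ] && return 0
  done
  return 1
}

# Example call:
#   has_gpustack_wheel gpustack_offline_packages || echo "gpustack wheel missing"
```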

Load and apply the pre-downloaded tools archive:

gpustack download-tools --load-archive gpustack_offline_tools.tar.gz

Now you can run GPUStack by following the instructions in the pip Installation guide.