Air-Gapped Installation

You can install GPUStack in an air-gapped environment. An air-gapped environment is one without internet access, so GPUStack must be installed offline.

The following methods are available for installing GPUStack in an air-gapped environment:

OS        Arch            Supported methods
Linux     AMD64, ARM64    Docker Installation (Recommended), pip Installation
Windows   AMD64           pip Installation

Supported backends

  • vLLM (Compute Capability 7.0 and above, only supports Linux AMD64)
  • llama-box
  • vox-box

Prerequisites

  • Port Requirements
  • CPU support for llama-box backend: AMD64 with AVX2, or ARM64 with NEON

Check if the CPU is supported.

On AMD64:

lscpu | grep avx2

On ARM64:

grep -E -i "neon|asimd" /proc/cpuinfo

Windows users need to manually verify support for the above instructions.
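
The CPU checks above can be wrapped in one helper that works on either architecture. This is a minimal sketch for Linux only; `cpu_supported` is a hypothetical helper name, not part of GPUStack, and it assumes the instruction-set flags appear in /proc/cpuinfo as they do on typical Linux systems.

```shell
#!/bin/sh
# Sketch: succeed if a cpuinfo-style file lists AVX2 (AMD64) or NEON/ASIMD (ARM64).
# cpu_supported is a hypothetical helper name, not a GPUStack command.
cpu_supported() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    # -q: no output, -i: case-insensitive, -w: whole words only, -E: extended regex
    grep -q -i -w -E "avx2|neon|asimd" "$cpuinfo_file"
}
```

For example, `cpu_supported && echo "CPU can run llama-box"` prints the message only when one of the required flags is present.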

Check if the NVIDIA driver is installed:

nvidia-smi --format=csv,noheader --query-gpu=index,name,memory.total,memory.used,utilization.gpu,temperature.gpu

Then ensure the driver supports CUDA 12.4 or higher.

On Linux:

nvidia-smi | grep "CUDA Version"

On Windows:

nvidia-smi | findstr /C:"CUDA Version"
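
To turn the version check into a pass/fail test, the reported version can be parsed and compared against 12.4. The sketch below assumes the nvidia-smi header contains the text "CUDA Version: X.Y", as it does in current driver releases; `cuda_version_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read nvidia-smi output on stdin and succeed if the CUDA version is >= 12.4.
# cuda_version_ok is a hypothetical helper name, not a GPUStack or NVIDIA command.
cuda_version_ok() {
    # Extract "MAJOR MINOR" from the first line containing "CUDA Version: X.Y".
    version=$(sed -n 's/.*CUDA Version: *\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p' | head -n 1)
    [ -n "$version" ] || return 1          # no version found in the input
    major=${version% *}
    minor=${version#* }
    [ "$major" -gt 12 ] || { [ "$major" -eq 12 ] && [ "$minor" -ge 4 ]; }
}
```

A possible use is `nvidia-smi | cuda_version_ok && echo "driver supports CUDA 12.4+"`.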

Docker Installation

Prerequisites

Check if Docker and NVIDIA Container Toolkit are installed:

docker info | grep Runtimes | grep nvidia

Note

When systemd is used to manage the cgroups of the container and it is triggered to reload any Unit files that have references to NVIDIA GPUs (e.g. systemctl daemon-reload), containerized GPU workloads may suddenly lose access to their GPUs.

In GPUStack, the GPUs may disappear from the Resources menu, and running nvidia-smi inside the GPUStack container may fail with the error: Failed to initialize NVML: Unknown Error

To prevent this issue, disable systemd cgroup management in Docker.

Set the parameter "exec-opts": ["native.cgroupdriver=cgroupfs"] in the /etc/docker/daemon.json file and restart Docker, for example:

vim /etc/docker/daemon.json
{
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  },
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
systemctl daemon-reload && systemctl restart docker
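
Before restarting Docker, it can help to confirm that the edited file really contains the cgroupfs setting. A minimal sketch, assuming the default configuration path shown above; `daemon_uses_cgroupfs` is a hypothetical helper name, and the check is a plain text match rather than a full JSON parse.

```shell
#!/bin/sh
# Sketch: succeed if the Docker daemon config mentions the cgroupfs driver option.
# daemon_uses_cgroupfs is a hypothetical helper name; it does not validate JSON syntax.
daemon_uses_cgroupfs() {
    config_file="${1:-/etc/docker/daemon.json}"
    # -F: match the option as a fixed string, -q: no output
    grep -q -F '"native.cgroupdriver=cgroupfs"' "$config_file"
}
```

After the restart, `docker info --format '{{.CgroupDriver}}'` should report cgroupfs if the option took effect.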

Run GPUStack

When running GPUStack with Docker, it works out of the box in an air-gapped environment as long as the Docker images are available. To do this, follow these steps:

  1. Pull the GPUStack Docker image in an online environment:

docker pull gpustack/gpustack

If your online environment differs from the air-gapped environment in terms of OS or arch, specify the OS and arch of the air-gapped environment when pulling the image:

docker pull --platform linux/amd64 gpustack/gpustack

  2. Publish the Docker image to a private registry, or load it directly in the air-gapped environment.
  3. Refer to the Docker Installation guide to run GPUStack using Docker.
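
The image transfer in the steps above can be sketched with the standard docker save, docker load, docker tag, and docker push commands. The archive file name and the registry host passed to `push_image` are hypothetical placeholders, and the `DOCKER` variable is only an illustration device so the commands can be dry-run with `DOCKER=echo`.

```shell
#!/bin/sh
# Sketch: move the GPUStack image into an air-gapped environment.
IMAGE="gpustack/gpustack"
ARCHIVE="gpustack-image.tar"      # hypothetical archive file name
DOCKER="${DOCKER:-docker}"        # set DOCKER=echo to print commands instead of running them

# Online host: write the pulled image to a tar archive.
save_image() { $DOCKER save -o "$ARCHIVE" "$IMAGE"; }

# Air-gapped host: load the archive into the local image store.
load_image() { $DOCKER load -i "$ARCHIVE"; }

# Alternative: retag and push to a private registry; $1 is the registry host (hypothetical).
push_image() {
    $DOCKER tag "$IMAGE" "$1/$IMAGE" && $DOCKER push "$1/$IMAGE"
}
```

Run `save_image` online, copy the archive across, then run `load_image` on the air-gapped host. With a registry reachable from both sides, `push_image registry.example.internal:5000` replaces the copy step.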

pip Installation

Prerequisites

  • Python 3.10 to 3.12

Check the Python version:

python -V
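
The version requirement can also be checked in a script. This is a small sketch that assumes the usual "Python X.Y.Z" output format of python -V; `python_version_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: succeed if a version string such as "Python 3.11.4" is within 3.10 to 3.12.
# python_version_ok is a hypothetical helper name.
python_version_ok() {
    ver=${1#Python }          # drop the leading "Python " if present
    major=${ver%%.*}          # text before the first dot
    rest=${ver#*.}            # text after the first dot
    minor=${rest%%.*}         # text before the next dot
    [ "$major" = "3" ] && [ "$minor" -ge 10 ] && [ "$minor" -le 12 ]
}
```

For example, `python_version_ok "$(python -V 2>&1)" || echo "unsupported Python version"`.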

Check if CUDA is installed and verify that its version is at least 12.4:

nvcc -V

Check if cuDNN 9 is installed.

On Linux:

ldconfig -p | grep libcudnn

On Windows (PowerShell):

Get-ChildItem -Path C:\ -Recurse -Filter "cudnn*.dll" -ErrorAction SilentlyContinue

Install GPUStack

For a manual pip installation, you need to prepare the required packages and tools in an online environment and then transfer them to the air-gapped environment.

Set up an online environment identical to the air-gapped environment, including OS, architecture, and Python version.

Step 1: Download the Required Packages

On Linux, run the following commands in an online environment.

For AMD64:

# Extra dependencies options are "vllm", "audio" and "all"
# "vllm" is only available for Linux AMD64
PACKAGE_SPEC="gpustack[all]"
# To install a specific version
# PACKAGE_SPEC="gpustack[all]==0.6.0"

For ARM64:

PACKAGE_SPEC="gpustack[audio]"
# To install a specific version
# PACKAGE_SPEC="gpustack[audio]==0.6.0"

If you don’t need the vLLM backend or support for audio models, just set:

PACKAGE_SPEC="gpustack"

On Windows (PowerShell), run the following commands in an online environment:

$PACKAGE_SPEC = "gpustack[audio]"
# To install a specific version
# $PACKAGE_SPEC = "gpustack[audio]==0.6.0"

If you don’t need support for audio models, just set:

$PACKAGE_SPEC = "gpustack"

Download all required packages:

pip wheel $PACKAGE_SPEC -w gpustack_offline_packages

Install GPUStack to use its CLI:

pip install gpustack

Download dependency tools and save them as an archive:

gpustack download-tools --save-archive gpustack_offline_tools.tar.gz

If your online environment differs from the air-gapped environment, specify the OS, architecture, and device explicitly:

gpustack download-tools --save-archive gpustack_offline_tools.tar.gz --system linux --arch amd64 --device cuda

Note

This instruction assumes that the online environment uses the same GPU type as the air-gapped environment. If the GPU types differ, use the --device flag to specify the device type for the air-gapped environment. Refer to the download-tools command for more information.

Step 2: Transfer the Packages

Transfer the following files from the online environment to the air-gapped environment.

  • gpustack_offline_packages directory.
  • gpustack_offline_tools.tar.gz file.
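
Because the files cross an air gap, often on removable media, it may be worth recording checksums before the transfer and verifying them afterwards. The sketch below is an optional extra, not part of the GPUStack procedure; it assumes GNU coreutils sha256sum is available on both sides, and `make_manifest`, `verify_manifest`, and the SHA256SUMS file name are hypothetical.

```shell
#!/bin/sh
# Sketch: checksum the offline artifacts before transfer and verify them after.
# $1 is the directory holding gpustack_offline_packages and gpustack_offline_tools.tar.gz.

# Online host: write one checksum line per file into SHA256SUMS.
make_manifest() {
    ( cd "$1" && find gpustack_offline_packages gpustack_offline_tools.tar.gz \
        -type f -exec sha256sum {} + > SHA256SUMS )
}

# Air-gapped host: fail if any listed file is missing or altered.
verify_manifest() {
    ( cd "$1" && sha256sum -c SHA256SUMS > /dev/null )
}
```

Run `make_manifest .` in the directory that holds the artifacts, copy SHA256SUMS along with them, and run `verify_manifest .` on the air-gapped host before installing.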

Step 3: Install GPUStack

In the air-gapped environment, run the following commands.

Install GPUStack from the downloaded packages:

pip install --no-index --find-links=gpustack_offline_packages gpustack

Load and apply the pre-downloaded tools archive:

gpustack download-tools --load-archive gpustack_offline_tools.tar.gz

Now you can run GPUStack by following the instructions in the pip Installation guide.