# Running Inference With Hygon DCUs
GPUStack supports running inference on Hygon GPUs. This tutorial will guide you through the configuration steps.
## Docker Installation

### System and Hardware Support
| OS    | Architecture | Status    | Verified     |
| ----- | ------------ | --------- | ------------ |
| Linux | x86_64       | Supported | Ubuntu 22.04 |
| Tools  | Verified               |
| ------ | ---------------------- |
| Driver | rock-5.7.1-6.2.26-V1.5 |
| DTK    | DTK-24.04.3            |
| Supported Backends | Verified |
| ------------------ | -------- |
| llama-box          | Yes      |
| vLLM               | Yes      |
| vox-box            | Yes      |
### Setup Instructions

#### Install Driver and DTK
- Install Required Packages

  Register and log in to the Hygon Developer Community: https://developer.hpccube.com/tool/#sdk

  Select the appropriate installation method for your system, download the Driver and DTK (DCU Toolkit), and install them following the documentation provided by the community.
- Verify Installation

```bash
# Verify rocminfo.
# Expected result: Device Type: DCU
rocminfo | grep DCU

# Check if the GPU is listed as an agent.
rocminfo

# Check rocm-smi.
rocm-smi -i --showmeminfo vram --showpower --showserial --showuse --showtemp --showproductname

# Check hy-smi.
hy-smi
```
#### Configure the Container Runtime

Follow the Docker Installation Guide to install and configure the container runtime.
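The linked guide covers distribution-specific details. As a quick sketch, Docker can often be installed with its official convenience script (verify against the guide for your system):

```bash
# Sketch only: Docker's official convenience script. Check the Docker
# Installation Guide for distribution-specific steps before running it.
curl -fsSL https://get.docker.com | sh

# Confirm the Docker daemon is up and reachable.
docker info
```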
#### Installing GPUStack

To set up an isolated environment for GPUStack, we recommend using Docker.

```bash
docker run -itd --shm-size 500g \
    --network=host --privileged \
    --group-add video \
    --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined \
    --device=/dev/kfd --device=/dev/dri \
    -v /opt/hyhal:/opt/hyhal:ro \
    gpustack/gpustack:v0.5.1-dcu
```
If the following message appears, the GPUStack container is running successfully:

```
2024-11-15T23:37:46+00:00 - gpustack.server.server - INFO - Serving on 0.0.0.0:80.
2024-11-15T23:37:46+00:00 - gpustack.worker.worker - INFO - Starting GPUStack worker.
```
Once the container is running, access the GPUStack web interface by navigating to http://localhost:80 in your browser. On the Resources page, you should see that GPUStack has recognized the Hygon DCU device.
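If the web interface is not reachable, you can inspect the container directly. A minimal check (the container ID below is a placeholder; use `docker ps` to find yours):

```bash
# List running containers to find the GPUStack container ID.
docker ps

# Follow the container logs to look for startup errors.
# Replace <container-id> with the actual ID from docker ps.
docker logs -f <container-id>
```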
#### Running Inference

After installation, you can deploy models and run inference. Refer to the model management documentation for detailed usage instructions.
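As a sketch, deployed models can be queried through GPUStack's OpenAI-compatible API. The model name, endpoint path, and API key below are placeholders for your own deployment; check the GPUStack API documentation for the exact endpoint in your version:

```bash
# Sketch only: a chat completion request against a model deployed in
# GPUStack. "my-model" and <your-api-key> are placeholders, and the
# endpoint path may differ depending on your GPUStack version.
curl http://localhost/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
```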
## Non-Docker Installation

### System and Hardware Support
| OS    | Architecture | Status    | Verified           |
| ----- | ------------ | --------- | ------------------ |
| Linux | x86_64       | Supported | Ubuntu 20.04/22.04 |
| Tools  | Verified               |
| ------ | ---------------------- |
| Driver | rock-5.7.1-6.2.26-V1.5 |
| DTK    | DTK-24.04.3            |
| Supported Backends | Verified |
| ------------------ | -------- |
| llama-box          | Yes      |
| vox-box            | Yes      |
### Setup Instructions

#### Install Driver and DTK

Install the Driver and DTK as described in the Docker installation section above.
#### Installing GPUStack
Once the environment is set up, install GPUStack following the installation guide.
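As a sketch (verify the exact command and flags in the installation guide for your GPUStack version), the Linux installation script can be run as:

```bash
# Sketch only: GPUStack's Linux installation script. Check the official
# installation guide for the current command and supported options.
curl -sfL https://get.gpustack.ai | sh -s -
```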
After installation, GPUStack will detect Hygon DCUs automatically.
#### Running Inference

After installation, you can deploy models and run inference. Refer to the model management documentation for usage details.
Note: The vLLM backend is not supported in non-Docker deployments.