Running Inference on Copilot+ PCs with Snapdragon X
GPUStack runs on ARM64 Windows, so you can use it on Snapdragon X-based Copilot+ PCs.
Note: Only CPU-based inference is supported on Snapdragon X devices. GPUStack does not currently support GPU or NPU acceleration on this platform.
Prerequisites
- A Copilot+ PC with Snapdragon X. In this tutorial, we use the Dell XPS 13 9345.
- AMD64 Python (version 3.10 to 3.12) installed. The AMD64 build is required even on ARM64 Windows; it runs through Windows' built-in x64 emulation.
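If you want to confirm that the interpreter on your PATH is the AMD64 build rather than an ARM64 one, a quick check with the standard library is enough:

```python
# Check which Python build is on PATH: GPUStack expects an AMD64
# interpreter, even on ARM64 Windows.
import platform
import sys

print(sys.version)         # on an AMD64 build, ends with "64 bit (AMD64)"
print(platform.machine())  # should report "AMD64", not "ARM64"
```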
Installing GPUStack
Run PowerShell as administrator (avoid using PowerShell ISE), then run the following command to install GPUStack:
```powershell
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content
```
After installation, follow the on-screen instructions to obtain credentials and log in to the GPUStack UI.
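Before moving on, you can confirm that the server is up. The sketch below is a minimal reachability check; it assumes the default install serving the UI on port 80 of the local machine, so adjust SERVER_URL if your setup differs:

```python
# Minimal reachability check for a freshly installed GPUStack server.
# Assumption: the UI is served at http://localhost (default port 80).
import urllib.request

SERVER_URL = "http://localhost"

with urllib.request.urlopen(SERVER_URL, timeout=10) as resp:
    print(f"GPUStack responded with HTTP {resp.status}")
```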
Deploying a Model
- Navigate to the Models page in the GPUStack UI.
- Click the Deploy Model button and select Hugging Face from the dropdown.
- Enable the GGUF checkbox to filter the results to GGUF-format models.
- Use the search bar in the top left to search for the model name Qwen/Qwen2.5-0.5B-Instruct-GGUF.
- Click Save to deploy the model.
Once deployed, you can monitor the model's status on the Models page.
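You can also confirm the deployment programmatically through GPUStack's OpenAI-compatible API. The sketch below is illustrative: the server URL, the base path (/v1-openai here, which may vary by GPUStack version), and YOUR_API_KEY (created beforehand on the API Keys page in the UI) are all assumptions to replace with your own values:

```python
# List models served by GPUStack via its OpenAI-compatible API.
# Assumptions: server at localhost, OpenAI-compatible routes under
# /v1-openai, and YOUR_API_KEY as a placeholder for a real key.
import json
import urllib.request

BASE_URL = "http://localhost/v1-openai"
API_KEY = "YOUR_API_KEY"

req = urllib.request.Request(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    for model in json.load(resp)["data"]:
        print(model["id"])
```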
Running Inference
Navigate to the Playground page in the GPUStack UI, where you can interact with the deployed model.
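The deployed model can also be queried outside the Playground. This sketch reuses the assumed server URL, base path, and placeholder API key from above; the model name is likewise an assumption and must match the name shown on the Models page:

```python
# Send a chat completion request to the deployed model.
# Assumptions: localhost server, /v1-openai base path, placeholder
# API key, and "qwen2.5-0.5b-instruct" as the deployed model's name.
import json
import urllib.request

BASE_URL = "http://localhost/v1-openai"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "qwen2.5-0.5b-instruct",
    "messages": [{"role": "user", "content": "Give me a one-line fun fact."}],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
with urllib.request.urlopen(req, timeout=60) as resp:
    reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
```

Since inference is CPU-only on this platform, expect modest throughput; the small Qwen2.5-0.5B model used in this tutorial is a reasonable fit for that constraint.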