Running Inference on Copilot+ PCs with Snapdragon X
GPUStack supports running on ARM64 Windows, enabling use on Snapdragon X-based Copilot+ PCs.
Note: Only CPU-based inference is supported on Snapdragon X devices. GPUStack does not currently support GPU or NPU acceleration on this platform.
Prerequisites
- A Copilot+ PC with Snapdragon X. In this tutorial, we use the Dell XPS 13 9345.
- Install AMD64 Python (version 3.10 to 3.12). Some dependencies do not provide ARM64 wheels, so the AMD64 build is required; it runs through Windows on ARM's x64 emulation. You can verify the active interpreter as shown below.
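To confirm that the Python on your PATH is the AMD64 build, a quick check (the exact output format may vary slightly between Python versions):

# Should print a version between 3.10 and 3.12.
python --version

# An AMD64 build running under x64 emulation prints "AMD64";
# a native ARM64 build prints "ARM64".
python -c "import platform; print(platform.machine())"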
Installing GPUStack
Run PowerShell as administrator (avoid using PowerShell ISE), then run the following command to install GPUStack:
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content
After installation, follow the on-screen instructions to obtain credentials and log in to the GPUStack UI.
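If you need the initial admin password again later, it is also written to a file during installation. A minimal sketch, assuming the Windows installer's default data directory (the exact path may differ in your GPUStack version):

# Print the auto-generated admin password (assumed default location).
Get-Content -Path "$env:APPDATA\gpustack\initial_admin_password" -Raw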
Deploying a Model
- Navigate to the `Deployments` page in the GPUStack UI.
- Click the `Deploy Model` button and select `Hugging Face` from the dropdown.
- Enable the `GGUF` checkbox to filter models by GGUF format.
- Use the search bar in the top left to search for the model name `Qwen/Qwen2.5-0.5B-Instruct-GGUF`.
- Click `Save` to deploy the model.
Once deployed, you can monitor the model deployment's status on the Deployments page.
Running Inference
Navigate to the Playground page in the GPUStack UI, where you can interact with the deployed model.
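Deployed models are also exposed through GPUStack's OpenAI-compatible API, so you can query them programmatically. A minimal PowerShell sketch, assuming the server listens on localhost, the OpenAI-compatible base path is /v1-openai (check the API documentation for your GPUStack version), the deployment is named qwen2.5-0.5b-instruct, and YOUR_API_KEY is a placeholder for a key created under API Keys in the UI:

# Build the request headers; replace YOUR_API_KEY with a real key.
$headers = @{
    "Content-Type"  = "application/json"
    "Authorization" = "Bearer YOUR_API_KEY"
}

# Build an OpenAI-style chat completion request body.
$body = @{
    model    = "qwen2.5-0.5b-instruct"   # assumed deployment name; use yours
    messages = @(
        @{ role = "user"; content = "Hello! Who are you?" }
    )
} | ConvertTo-Json -Depth 5

# Send the request to the assumed OpenAI-compatible endpoint.
Invoke-RestMethod -Uri "http://localhost/v1-openai/chat/completions" -Method Post -Headers $headers -Body $body

The response follows the OpenAI chat completions schema, so the model's reply is under choices[0].message.content.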