Running Inference on Copilot+ PCs with Snapdragon X

GPUStack supports running on ARM64 Windows, enabling use on Snapdragon X-based Copilot+ PCs.

Note

Only CPU-based inference is supported on Snapdragon X devices. GPUStack does not currently support GPU or NPU acceleration on this platform.

Prerequisites

Run PowerShell as administrator (avoid using PowerShell ISE), then run the following command to install GPUStack:

Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content

After installation, follow the on-screen instructions to obtain credentials and log in to the GPUStack UI.

Navigate to the Deployments page in the GPUStack UI.
Click on the Deploy Model button and select Hugging Face from the dropdown.
Enable the GGUF checkbox to filter models by GGUF format.
Use the search bar in the top left to search for the model name Qwen/Qwen2.5-0.5B-Instruct-GGUF.
Click Save to deploy the model.

Once deployed, you can monitor the model deployment's status on the Deployments page.

Navigate to the Playground page in the GPUStack UI, where you can interact with the deployed model.