# Model File Management

GPUStack allows admins to download and manage model files.

## Add Model File

GPUStack currently supports models from Hugging Face, ModelScope, Ollama, and local paths. To add model files, navigate to the `Resources` page and click the `Model Files` tab.
### Add a Hugging Face Model

- Click the `Add Model File` button and select `Hugging Face` from the dropdown.
- Use the search bar in the top left to find a model by name, e.g., `Qwen/Qwen2.5-0.5B-Instruct`. To search only for GGUF models, check the `GGUF` checkbox.
- (Optional) For GGUF models, select the desired quantization format from `Available Files`.
- Select the target worker to download the model file.
- (Optional) Specify a `Local Directory` to download the model to a custom path instead of the GPUStack cache directory (see the sketch after these steps).
- Click the `Save` button.
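If you prefer to stage the files yourself and then point a `Local Directory` at them, the `huggingface_hub` Python package can download the same repository outside the UI. This is a minimal sketch, not a GPUStack feature; the target directory is a hypothetical path on the worker.

```python
from huggingface_hub import snapshot_download

# Download the repository into a directory you can later reference as "Local Directory".
snapshot_download(
    repo_id="Qwen/Qwen2.5-0.5B-Instruct",
    local_dir="/data/models/Qwen2.5-0.5B-Instruct",  # hypothetical path on the worker
    # allow_patterns=["*.gguf"],  # for GGUF repositories, fetch only the GGUF files
)
```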
### Add a ModelScope Model

- Click the `Add Model File` button and select `ModelScope` from the dropdown.
- Use the search bar in the top left to find a model by name, e.g., `Qwen/Qwen2.5-0.5B-Instruct`. To search only for GGUF models, check the `GGUF` checkbox.
- (Optional) For GGUF models, select the desired quantization format from `Available Files`.
- Select the target worker to download the model file.
- (Optional) Specify a `Local Directory` to download the model to a custom path instead of the GPUStack cache directory (a pre-download sketch follows these steps).
- Click the `Save` button.
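ModelScope offers a similar Python download API if you want to pre-fetch the files before registering them via `Local Directory`. A minimal sketch, assuming the `modelscope` package is installed; the cache path is hypothetical.

```python
from modelscope.hub.snapshot_download import snapshot_download

# Downloads the repository and returns the local directory it was placed in.
model_dir = snapshot_download(
    "Qwen/Qwen2.5-0.5B-Instruct",
    cache_dir="/data/modelscope-cache",  # hypothetical path on the worker
)
print(model_dir)
```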
### Add an Ollama Model

- Click the `Add Model File` button and select `Ollama Library` from the dropdown.
- Select a model from the dropdown list or input a custom Ollama model, e.g., `llama3`, `llama3:70b`, or `youraccount/llama3:70b` (the naming convention is sketched after these steps).
- Select the target worker to download the model file.
- (Optional) Specify a `Local Directory` to download the model to a custom path instead of the GPUStack cache directory.
- Click the `Save` button.
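For reference, short Ollama model names expand to a fuller `account/name:tag` form: official models live under the implicit `library` account, and a missing tag defaults to `latest`. The helper below only illustrates that registry naming convention; it is not part of GPUStack.

```python
def normalize_ollama_ref(ref: str) -> str:
    """Expand a short Ollama model reference to account/name:tag form (illustrative only)."""
    name, _, tag = ref.partition(":")
    tag = tag or "latest"                 # missing tag defaults to "latest"
    account, _, model = name.rpartition("/")
    account = account or "library"        # official models live under "library"
    return f"{account}/{model}:{tag}"

print(normalize_ollama_ref("llama3"))               # library/llama3:latest
print(normalize_ollama_ref("llama3:70b"))           # library/llama3:70b
print(normalize_ollama_ref("youraccount/llama3:70b"))  # youraccount/llama3:70b
```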
### Add a Local Path Model

You can add models from a local path. The path can be a directory (e.g., a Hugging Face model folder) or a file (e.g., a GGUF model) located on the worker; a quick path check is sketched after the steps below.

- Click the `Add Model File` button and select `Local Path` from the dropdown.
- Enter the `Model Path`.
- Select the target worker.
- Click the `Save` button.
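Before saving, it can help to confirm that the path on the worker looks like one of the supported forms. The check below is an illustrative heuristic, not GPUStack's own validation: either a single `.gguf` file or a Hugging Face style directory (for example, one containing a `config.json`).

```python
from pathlib import Path

def describe_model_path(path: str) -> str:
    """Roughly classify a local model path; illustrative only, not GPUStack's validation."""
    p = Path(path)
    if p.is_file() and p.suffix == ".gguf":
        return "single GGUF file"
    if p.is_dir() and (p / "config.json").exists():
        return "Hugging Face style model directory"
    return "unrecognized: double-check the path on the target worker"

print(describe_model_path("/data/models/qwen2.5-0.5b-instruct-q4_k_m.gguf"))
```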
## Retry Download

If a model file download fails, you can retry it:

- Navigate to the `Resources` page and click the `Model Files` tab.
- Locate the model file with an error status.
- Click the ellipsis button in the `Operations` column and select `Retry Download`.
- GPUStack will attempt to download the model file again from the specified source.
## Deploy Model

Models can be deployed from model files. Since the model file is stored on a specific worker, GPUStack will add a worker selector using the `worker-name` key to ensure proper scheduling (see the sketch after the steps below).

- Navigate to the `Resources` page and click the `Model Files` tab.
- Find the model file you want to deploy.
- Click the `Deploy` button in the `Operations` column.
- Review or adjust the `Name`, `Replicas`, and other deployment parameters.
- Click the `Save` button.
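The worker selector behaves like a label match: a worker is only eligible if it carries every key/value pair in the selector, which is why pinning `worker-name` keeps the deployment on the worker that holds the file. The snippet below is a minimal sketch of that matching idea with hypothetical worker names, not GPUStack's scheduler code.

```python
# Hypothetical worker records; "labels" stands in for the key/value labels attached to each worker.
workers = [
    {"name": "worker-a", "labels": {"worker-name": "worker-a"}},
    {"name": "worker-b", "labels": {"worker-name": "worker-b"}},
]

# Selector added when deploying a model file that lives on worker-a.
selector = {"worker-name": "worker-a"}

def matches(worker: dict, selector: dict) -> bool:
    """A worker is eligible only if it carries every key/value pair in the selector."""
    return all(worker["labels"].get(k) == v for k, v in selector.items())

eligible = [w["name"] for w in workers if matches(w, selector)]
print(eligible)  # ['worker-a']
```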
## Delete Model File

- Navigate to the `Resources` page and click the `Model Files` tab.
- Find the model file you want to delete.
- Click the ellipsis button in the `Operations` column and select `Delete`.
- (Optional) Check the `Also delete the file from disk` option.
- Click the `Delete` button to confirm.