# Model File Management

GPUStack allows admins to download and manage model files.

## Add Model File

GPUStack currently supports models from Hugging Face, ModelScope, Ollama, and local paths. To add model files, navigate to the `Resources` page and click the `Model Files` tab.
### Add a Hugging Face Model

- Click the `Add Model File` button and select `Hugging Face` from the dropdown.
- Use the search bar in the top left to find a model by name, e.g., `Qwen/Qwen2.5-0.5B-Instruct`. To search only for GGUF models, check the `GGUF` checkbox.
- (Optional) For GGUF models, select the desired quantization format from `Available Files`.
- Select the target worker to download the model file.
- (Optional) Specify a `Local Directory` to download the model to a custom path instead of the GPUStack cache directory (see the sketch after these steps).
- Click the `Save` button.
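If you prefer to stage the files yourself and then point a `Local Directory` at them, the `huggingface_hub` Python package can download the same repository outside the UI. This is a minimal sketch, not a GPUStack feature; the target directory is a hypothetical path on the worker.

```python
from huggingface_hub import snapshot_download

# Download the repository into a directory you can later reference as "Local Directory".
snapshot_download(
    repo_id="Qwen/Qwen2.5-0.5B-Instruct",
    local_dir="/data/models/Qwen2.5-0.5B-Instruct",  # hypothetical path on the worker
    # allow_patterns=["*.gguf"],  # for GGUF repositories, fetch only the GGUF files
)
```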
### Add a ModelScope Model

- Click the `Add Model File` button and select `ModelScope` from the dropdown.
- Use the search bar in the top left to find a model by name, e.g., `Qwen/Qwen2.5-0.5B-Instruct`. To search only for GGUF models, check the `GGUF` checkbox.
- (Optional) For GGUF models, select the desired quantization format from `Available Files`.
- Select the target worker to download the model file.
- (Optional) Specify a `Local Directory` to download the model to a custom path instead of the GPUStack cache directory (a pre-download sketch follows these steps).
- Click the `Save` button.
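ModelScope offers a similar Python download API if you want to pre-fetch the files before registering them via `Local Directory`. A minimal sketch, assuming the `modelscope` package is installed; the cache path is hypothetical.

```python
from modelscope.hub.snapshot_download import snapshot_download

# Downloads the repository and returns the local directory it was placed in.
model_dir = snapshot_download(
    "Qwen/Qwen2.5-0.5B-Instruct",
    cache_dir="/data/modelscope-cache",  # hypothetical path on the worker
)
print(model_dir)
```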
### Add an Ollama Model

- Click the `Add Model File` button and select `Ollama Library` from the dropdown.
- Select a model from the dropdown list or input a custom Ollama model, e.g., `llama3`, `llama3:70b`, or `youraccount/llama3:70b` (the naming convention is sketched after these steps).
- Select the target worker to download the model file.
- (Optional) Specify a `Local Directory` to download the model to a custom path instead of the GPUStack cache directory.
- Click the `Save` button.
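For reference, short Ollama model names expand to a fuller `account/name:tag` form: official models live under the implicit `library` account, and a missing tag defaults to `latest`. The helper below only illustrates that registry naming convention; it is not part of GPUStack.

```python
def normalize_ollama_ref(ref: str) -> str:
    """Expand a short Ollama model reference to account/name:tag form (illustrative only)."""
    name, _, tag = ref.partition(":")
    tag = tag or "latest"                 # missing tag defaults to "latest"
    account, _, model = name.rpartition("/")
    account = account or "library"        # official models live under "library"
    return f"{account}/{model}:{tag}"

print(normalize_ollama_ref("llama3"))               # library/llama3:latest
print(normalize_ollama_ref("llama3:70b"))           # library/llama3:70b
print(normalize_ollama_ref("youraccount/llama3:70b"))  # youraccount/llama3:70b
```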
### Add a Local Path Model

You can add models from a local path. The path can be a directory (e.g., a Hugging Face model folder) or a file (e.g., a GGUF model) located on the worker; a quick path check is sketched after the steps below.

- Click the `Add Model File` button and select `Local Path` from the dropdown.
- Enter the `Model Path`.
- Select the target worker.
- Click the `Save` button.
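Before saving, it can help to confirm that the path on the worker looks like one of the supported forms. The check below is an illustrative heuristic, not GPUStack's own validation: either a single `.gguf` file or a Hugging Face style directory (for example, one containing a `config.json`).

```python
from pathlib import Path

def describe_model_path(path: str) -> str:
    """Roughly classify a local model path; illustrative only, not GPUStack's validation."""
    p = Path(path)
    if p.is_file() and p.suffix == ".gguf":
        return "single GGUF file"
    if p.is_dir() and (p / "config.json").exists():
        return "Hugging Face style model directory"
    return "unrecognized: double-check the path on the target worker"

print(describe_model_path("/data/models/qwen2.5-0.5b-instruct-q4_k_m.gguf"))
```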
## Retry Download

If a model file download fails, you can retry it:

- Navigate to the `Resources` page and click the `Model Files` tab.
- Locate the model file with an error status.
- Click the ellipsis button in the `Operations` column and select `Retry Download`.
- GPUStack will attempt to download the model file again from the specified source.
## Deploy Model

Models can be deployed from model files. Since the model file is stored on a specific worker, GPUStack will add a worker selector using the `worker-name` key to ensure proper scheduling (see the sketch after the steps below).

- Navigate to the `Resources` page and click the `Model Files` tab.
- Find the model file you want to deploy.
- Click the `Deploy` button in the `Operations` column.
- Review or adjust the `Name`, `Replicas`, and other deployment parameters.
- Click the `Save` button.
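The worker selector behaves like a label match: a worker is only eligible if it carries every key/value pair in the selector, which is why pinning `worker-name` keeps the deployment on the worker that holds the file. The snippet below is a minimal sketch of that matching idea with hypothetical worker names, not GPUStack's scheduler code.

```python
# Hypothetical worker records; "labels" stands in for the key/value labels attached to each worker.
workers = [
    {"name": "worker-a", "labels": {"worker-name": "worker-a"}},
    {"name": "worker-b", "labels": {"worker-name": "worker-b"}},
]

# Selector added when deploying a model file that lives on worker-a.
selector = {"worker-name": "worker-a"}

def matches(worker: dict, selector: dict) -> bool:
    """A worker is eligible only if it carries every key/value pair in the selector."""
    return all(worker["labels"].get(k) == v for k, v in selector.items())

eligible = [w["name"] for w in workers if matches(w, selector)]
print(eligible)  # ['worker-a']
```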
## Delete Model File

- Navigate to the `Resources` page and click the `Model Files` tab.
- Find the model file you want to delete.
- Click the ellipsis button in the `Operations` column and select `Delete`.
- (Optional) Check the `Also delete the file from disk` option.
- Click the `Delete` button to confirm.