Model Catalog
The Model Catalog is an index of GPUStack-tuned models.
Browse Models
To browse the Model Catalog, navigate to the Catalog page. You can filter models by name and category. The following screenshot shows the Model Catalog page:
Deploy a Model from the Catalog
To deploy a model from the Model Catalog, click its model card. A deployment configuration page appears, where you can review and customize the deployment configuration. Click the Save button to deploy the model.
Customize Model Catalog
To customize the Model Catalog, provide a YAML file through the GPUStack server configuration using the --model-catalog-file flag. The flag accepts either a local file path or a URL. For the schema, refer to the built-in model catalog file or to the Model Catalog Schema section below.
The following is an example of a custom model catalog YAML file:
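As a minimal sketch, a custom catalog might look like the file below. It uses only the fields described in the Model Catalog Schema section. Every name, repository ID, URL, category, and capability value is a hypothetical placeholder, and the spec-level source and huggingface_repo_id fields are assumed to behave like the corresponding models API fields. The built-in model catalog file remains the authoritative reference for the layout.

```yaml
# Hypothetical custom catalog; all names, IDs, and URLs are placeholders.
draft_models:
  - name: example-draft-model
    algorithm: eagle3              # currently the only supported algorithm
    source: huggingface
    huggingface_repo_id: example-org/example-draft-model

model_sets:
  - name: Example Model 8B
    description: An illustrative catalog entry.
    home: https://example.com/models/example-model
    icon: https://example.com/icons/example-model.png
    categories:
      - llm                        # placeholder category value
    capabilities:
      - context/32K                # placeholder capability value
    size: 8                        # billions of parameters
    licenses:
      - apache-2.0
    release_date: "2025-01-01"     # YYYY-MM-DD, quoted so it stays a string
    specs:
      - mode: standard
        quantization: FP16
        # Assumption: these two fields mirror the models API fields.
        source: huggingface
        huggingface_repo_id: example-org/example-model-8b
```

A file like this would be passed to the GPUStack server through the --model-catalog-file flag, either as a local path or as a URL.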
Using Model Catalog in Air-Gapped Environments
The built-in model catalog sources models from either Hugging Face or ModelScope. If you are using GPUStack in an air-gapped environment without internet access, you can customize the model catalog to use a local-path model source. Here is an example:
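A sketch of a catalog entry that points at a model already present on disk is shown below. The source value local_path and the local_path field name are assumptions based on the "local-path model source" wording above, and the model name and path are hypothetical placeholders. Confirm the exact identifiers against the built-in model catalog file and the API Reference before relying on them.

```yaml
# Hypothetical air-gapped catalog entry; the model name and path are placeholders.
model_sets:
  - name: Example Model 8B
    description: Served from a pre-downloaded local copy.
    categories:
      - llm                        # placeholder category value
    size: 8
    release_date: "2025-01-01"
    specs:
      - mode: standard
        quantization: FP16
        source: local_path         # assumed identifier for the local-path source
        local_path: /path/to/example-model-8b   # assumed field name; path on the host running the model
```

The model files need to exist at the given path before deployment, since nothing can be downloaded without internet access.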
Model Catalog Schema
The Model Catalog YAML file contains two main sections: draft_models and model_sets.
- draft_models: A list of draft models for speculative decoding.
- model_sets: A list of model sets that are tested and optimized.
Each draft model has the following fields:
| Field | Type | Description |
|---|---|---|
| name | string | The name of the draft model. |
| algorithm | string | The speculative decoding algorithm of the model. Currently, only eagle3 is supported. |
| source | string | The source of the model (e.g., huggingface, model_scope). |
| huggingface_repo_id | string | The Hugging Face repository ID of the model (if source is huggingface). |
| model_scope_model_id | string | The ModelScope repository ID of the model (if source is model_scope). |
Each model set has the following fields:
| Field | Type | Description |
|---|---|---|
| name | string | The name of the model. |
| description | string | A brief description of the model. |
| home | string | The homepage URL of the model. |
| icon | string | The icon URL of the model. |
| categories | list of string | A list of categories that the model belongs to. |
| capabilities | list of string | A list of capabilities of the model. |
| size | int | The size of the model in billions of parameters. |
| licenses | list of string | A list of licenses of the model. |
| release_date | string | The release date of the model in YYYY-MM-DD format. |
| specs | list of spec | A list of deployment specifications for the model. |
Each deployment spec has the following fields:
| Field | Type | Description |
|---|---|---|
| mode | string | The deployment mode. GPUStack provides conventional and optimized modes for different use cases, covering throughput, latency, and standard scenarios. Custom modes can also be defined as needed. |
| quantization | string | The quantization type (e.g., FP16, FP8, INT8). |
| gpu_filters | dict | GPU filters to specify compatible GPUs. |
Other fields in a deployment spec are similar to the models API fields. For more details, see the API Reference documentation.
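To illustrate how several deployment specs could sit side by side under one model set, the fragment below lists two specs with different modes and quantization types. The keys inside gpu_filters (vendor, compute_capability), the spec-level source and huggingface_repo_id fields, and all repository IDs are assumptions made for illustration; the schema above only states that gpu_filters is a dict.

```yaml
# Fragment of a single model set entry; it belongs under model_sets[].specs.
specs:
  - mode: throughput
    quantization: FP8
    gpu_filters:
      # Assumed filter keys; check the built-in catalog file for the supported ones.
      vendor: nvidia
      compute_capability: ">=9.0"
    source: huggingface
    huggingface_repo_id: example-org/example-model-8b-fp8   # placeholder
  - mode: standard
    quantization: FP16
    source: huggingface
    huggingface_repo_id: example-org/example-model-8b       # placeholder
```

Under these assumptions, the FP8 spec would be offered only on GPUs that match its filters, while the FP16 spec has no filters and would serve as the general-purpose option.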
