
Inference Backend Management

GPUStack allows admins to configure inference backends and backend versions.

This article serves as an operational guide for the Inference Backend page. For supported built-in backends and their capabilities, see Built-in Inference Backends.

For guidelines on configuring custom backends and examples of custom backends that have been verified to work, see Custom Inference Backends.

Parameter Description

| Parameter Name | Description | Required |
| --- | --- | --- |
| Name | Inference backend name | Yes |
| Health Check Path | Health check path used to verify the backend is up and responding. Default: /v1/models (OpenAI-compatible). | No |
| Default Execution Command | Container startup command/args. For example (vLLM): vllm serve {{model_path}} --port {{port}} --served-model-name {{model_name}} --host {{worker_ip}}. The placeholders {{model_path}}, {{model_name}}, {{port}}, and {{worker_ip}} are substituted automatically when the deployment is scheduled to a worker. | No |
| Default Backend Parameters | Pre-populates the Advanced Backend Parameters section during deployment; you can adjust the values before launching | No |
| Description | Description of the backend | No |
| Version Configs | Configure the available versions of this backend | Yes |
| Default Version | Preselected during deployment. If you don't choose a version explicitly, the default version's image is used | No |
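
To make the placeholder substitution concrete, here is a minimal sketch of how the example vLLM command resolves on a worker. The key name "default_run_command" and all concrete values are illustrative assumptions; only the placeholder names come from the table above.

```yaml
# Sketch only: "default_run_command" is an assumed key name, not the
# verified GPUStack schema; the placeholder names come from the table above.
default_run_command: >-
  vllm serve {{model_path}} --port {{port}}
  --served-model-name {{model_name}} --host {{worker_ip}}

# After the deployment is scheduled to a worker, the placeholders are
# substituted, so the container would start with something like
# (all concrete values below are illustrative):
#
#   vllm serve /var/lib/gpustack/models/qwen2.5-7b --port 40000 \
#     --served-model-name qwen2.5-7b --host 192.168.1.10
```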

Parameters within Version Configs:

| Parameter Name | Description | Required |
| --- | --- | --- |
| Version | Version name shown in the Backend Version dropdown during deployment | Yes |
| Image Name | Container image name for the backend (e.g., ghcr.io/org/image:tag) | Yes |
| Framework (custom_framework) | Backend framework (internal identifier: custom_framework). Deployment and scheduling are filtered by supported frameworks | Yes |
| Execution Command | Version-specific startup command. If omitted, the Default Execution Command is used | No |
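
As a rough illustration, a single version entry might look like the following in YAML form. The field names are assumptions modeled on the table above (the form shows the exact keys), and the image name and tag are examples.

```yaml
# Field names are modeled on the table above and may not match the exact
# GPUStack schema; check the form or YAML editor for the real keys.
version_configs:
  - version: v0.8.4                        # shown in the Backend Version dropdown
    image_name: vllm/vllm-openai:v0.8.4    # container image for this version (example)
    custom_framework: vLLM                 # used to filter deployment and scheduling
    # Optional; falls back to the Default Execution Command if omitted:
    run_command: >-
      vllm serve {{model_path}} --port {{port}}
      --served-model-name {{model_name}} --host {{worker_ip}}
```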

Add Custom Inference Backend

  1. Click the "Add Backend" button in the top-right corner.
  2. You can add a custom inference backend by completing the form or by pasting a YAML definition; refer to the parameter descriptions above for field meanings. A minimal YAML sketch follows these steps.
  3. The backend name cannot be modified after creation. Custom backend names must end with "-custom" (pre-filled in the form).
  4. Click "Save" to submit.
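
For the YAML path in step 2, a complete minimal definition might look like the sketch below. All field names are assumptions derived from the parameter tables above, and the SGLang image, tag, and command are examples; adapt them to the backend you are wrapping.

```yaml
# Hedged sketch of a pasted YAML definition; field names are assumptions
# based on the parameter tables, not the verified GPUStack schema.
name: sglang-custom                        # custom backend names must end with "-custom"
health_check_path: /v1/models
description: Custom SGLang backend for testing
version_configs:
  - version: v0.4.6
    image_name: lmsysorg/sglang:v0.4.6     # example image and tag
    custom_framework: SGLang
    run_command: >-
      python3 -m sglang.launch_server --model-path {{model_path}}
      --port {{port}} --served-model-name {{model_name}} --host {{worker_ip}}
default_version: v0.4.6
```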

Edit Inference Backend or Add Custom Version

  1. On the Inference Backend page, locate the target backend. From the card's top-right dropdown menu, choose "Edit".
  2. Modify backend properties (the name cannot be changed), or add a new version.
  3. For built-in backends, custom versions must end with "-custom" (pre-filled in the form); see the fragment after these steps.
  4. Click "Save" to submit.
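
For step 3, adding a custom version to a built-in backend might look like the fragment below (field names and the image tag are illustrative, as before):

```yaml
version_configs:
  - version: v0.9.0-custom                 # must end with "-custom" on built-in backends
    image_name: vllm/vllm-openai:v0.9.0    # example image and tag
    # run_command omitted: the Default Execution Command is used
```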

Delete Custom Inference Backend

  1. On the Inference Backend page, locate the target backend and select "Delete" from the card's top-right dropdown menu. Note that built-in backends cannot be deleted; only custom backends can.
  2. Click "Delete" in the confirmation dialog.

List Versions of Inference Backend

On the Inference Backend page, click anywhere on the backend card (except the action buttons) to open a modal where you can browse all built-in and custom-added versions.

Flexible Testing Deployment

Use this mode to quickly verify or tweak the image and startup command without editing the backend definition.

  1. Navigate to the Deployments page, click the "Deploy Model" button, and choose any model source.
  2. In the Basic tab, open the "Backend" dropdown and select "Custom" under the "Built-in" section.
  3. Two fields appear: image_name and run_command. These override the backend configuration for this deployment only (example values are sketched after these steps).
  4. Review the remaining required settings and submit the deployment.
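
As an illustration, the two override fields from step 3 might be filled in as follows; the image tag and concrete command are examples only, while the placeholders behave as described in the parameter table above.

```yaml
# Values are illustrative; they apply only to this one deployment.
image_name: vllm/vllm-openai:v0.8.4
run_command: >-
  vllm serve {{model_path}} --port {{port}}
  --served-model-name {{model_name}} --host {{worker_ip}}
```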