OpenAI Compatible APIs
GPUStack serves OpenAI-compatible APIs using the /v1-openai path. Most of the APIs also work under the /v1 path as an alias, except for the models endpoint, which is reserved for GPUStack management APIs.
Any application or framework that supports the OpenAI-compatible API can use the models deployed on GPUStack through these endpoints.
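As a minimal sketch of such an integration, the following Python snippet builds a chat completion request against the /v1-openai path using only the standard library. The helper names, the server URL, the API key, and the model name are hypothetical placeholders, and the chat/completions route and Bearer-token header are assumed to follow the standard OpenAI conventions.

```python
import json
import urllib.request


def build_chat_request(server_url, api_key, model, messages):
    """Build (but do not send) a chat completion request for GPUStack."""
    # /v1-openai is the OpenAI-compatible prefix described above.
    # "chat/completions" is the standard OpenAI route name (assumed here).
    url = server_url.rstrip("/") + "/v1-openai/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer-token authentication is an assumption of this sketch.
            "Authorization": f"Bearer {api_key}",
        },
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, pass your own server URL, API key, and the name of a deployed model to build_chat_request, then hand the result to send. An existing OpenAI client library should work the same way once its base URL points at the /v1-openai path.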
Supported Endpoints
The following API endpoints are supported:
- List Models
- Create Completion
- Create Chat Completion
- Create Embeddings
- Create Image
- Create Image Edit
- Create Speech
- Create Transcription
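For orientation, the sketch below maps each endpoint listed above to a URL under the /v1-openai prefix. The route names are the standard OpenAI ones and are an assumption here, since this page does not spell them out. The endpoint_url helper is hypothetical.

```python
OPENAI_PREFIX = "/v1-openai"

# Standard OpenAI route names, assumed (not confirmed by this page)
# to be the ones GPUStack exposes under the prefix above.
ENDPOINT_PATHS = {
    "List Models": "/models",
    "Create Completion": "/completions",
    "Create Chat Completion": "/chat/completions",
    "Create Embeddings": "/embeddings",
    "Create Image": "/images/generations",
    "Create Image Edit": "/images/edits",
    "Create Speech": "/audio/speech",
    "Create Transcription": "/audio/transcriptions",
}


def endpoint_url(server_url, endpoint_name):
    """Return the full URL for one of the supported endpoints."""
    return server_url.rstrip("/") + OPENAI_PREFIX + ENDPOINT_PATHS[endpoint_name]
```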
Rerank API
In the context of Retrieval-Augmented Generation (RAG), reranking refers to the process of selecting the most relevant information from retrieved documents or knowledge sources before presenting them to the user or utilizing them for answer generation.
Note that the OpenAI-compatible APIs do not include a rerank API, so GPUStack serves a Jina-compatible Rerank API using the /v1/rerank path.
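A rerank call can be sketched as follows. The request fields (model, query, documents, top_n) and the response fields (results, index, relevance_score) follow the Jina rerank format, which GPUStack is assumed to mirror. The helper names, server URL, API key, and model name are hypothetical placeholders.

```python
import json
import urllib.request


def build_rerank_request(server_url, api_key, model, query, documents, top_n=3):
    """Build (but do not send) a Jina-style rerank request for GPUStack."""
    url = server_url.rstrip("/") + "/v1/rerank"
    payload = {
        "model": model,
        "query": query,
        "documents": documents,  # list of candidate strings
        "top_n": top_n,          # number of results to return
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer-token authentication is an assumption of this sketch.
            "Authorization": f"Bearer {api_key}",
        },
    )


def ranked_documents(response, documents):
    """Pair each returned result with its source text, best match first."""
    results = sorted(
        response["results"],
        key=lambda item: item["relevance_score"],
        reverse=True,
    )
    return [(documents[item["index"]], item["relevance_score"]) for item in results]
```

Each result is resolved through its index field rather than its position in the response, so the original documents list stays the source of truth for the text.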