Integrate with Dify
Dify can integrate with GPUStack to leverage locally deployed LLMs, embeddings, reranking, image generation, Speech-to-Text and Text-to-Speech capabilities.
Deploying Models
- In the GPUStack UI, navigate to the Models page and click Deploy Model to deploy the models you need. Here are some example models:
- qwen3-8b
- qwen2.5-vl-3b-instruct
- bge-m3
- bge-reranker-v2-m3
- In the model’s Operations, open API Access Info to see how to integrate with this model.
Create an API Key
- Navigate to the API Keys page and click New API Key.
- Fill in the name, then click Save.
- Copy the API key and save it for later use.
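If you want to confirm that the key and a deployed model work before wiring them into Dify, you can call GPUStack's OpenAI-compatible API directly. The following is only a sketch: the base path shown here is an assumption (it can vary by GPUStack version, so copy the exact URL from the model's API Access Info), and YOUR_GPUSTACK_API_KEY stands in for the key you just created.

```bash
# Sketch of a direct request to GPUStack's OpenAI-compatible chat API.
# The /v1-openai base path is an assumption; use the URL from "API Access Info".
curl http://your-gpustack-url/v1-openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_GPUSTACK_API_KEY" \
  -d '{
    "model": "qwen3-8b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

A normal JSON chat completion response indicates that both the model deployment and the API key are working.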
Integrating GPUStack into Dify
- Access the Dify UI, go to the top right corner and click on PLUGINS, select Install from Marketplace, search for the GPUStack plugin, and install it.
- After installation, go to Settings > Model Provider > GPUStack, then select Add Model and fill in:
- Model Type: Select the model type based on the model.
- Model Name: The name must match the model name deployed on GPUStack.
- Server URL: http://your-gpustack-url. Do not use localhost, as it refers to the container’s internal network. If you’re using a custom port, make sure to include it. Also, ensure the URL is accessible from inside the Dify container (you can test this with curl; see the sketch after this list).
- API Key: Enter the API key you copied in the previous steps.
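Before saving, it can help to confirm that the Server URL is actually reachable from inside Dify. A minimal sketch, assuming a Docker Compose deployment of Dify where the API container is named docker-api-1 (the container name is an assumption; check `docker ps` for yours):

```bash
# Hypothetical reachability check run from inside the Dify API container.
# Replace docker-api-1 with your actual container name (see `docker ps`).
docker exec -it docker-api-1 curl -I http://your-gpustack-url
# Any HTTP response means the address is reachable from Dify;
# "connection refused" or a timeout means Dify cannot reach GPUStack at that URL.
```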
Click Save to add the model:
Add other models as needed, then select the added models in the System Model Settings and save:
You can now use the models in the Studio and Knowledge. Here is a simple example:
- Go to Knowledge to create a knowledge base and upload your documents:
- Configure the Chunk Settings and Retrieval Settings. Use the embedding model to generate document embeddings, and the rerank model to perform retrieval ranking.
- After successfully importing the documents, create an application in the Studio, add the previously created knowledge base, select the chat model, and interact with it:
- Switch the model to qwen2.5-vl-3b-instruct, remove the previously added knowledge base, enable Vision, and upload an image in the chat to activate multimodal input:
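When Vision is enabled, Dify passes the uploaded image to the model as part of a multimodal chat request. If you want to check the vision model outside of Dify, a request using the OpenAI-style image_url content part might look like the sketch below; the base path, API key, and image URL are all placeholders to adjust for your deployment.

```bash
# Sketch of a multimodal request to the vision model via the OpenAI-compatible API.
# Base path, API key, and image URL are placeholders; adjust them for your setup.
curl http://your-gpustack-url/v1-openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_GPUSTACK_API_KEY" \
  -d '{
    "model": "qwen2.5-vl-3b-instruct",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}}
      ]
    }]
  }'
```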