Setting Up a Multi-node GPUStack Cluster
This tutorial guides you through setting up a multi-node GPUStack cluster, in which workloads are distributed across multiple GPU-enabled nodes. It assumes basic familiarity with running commands on Linux, macOS, or Windows.
Prerequisites
Before starting, ensure you have the following:
- Multiple nodes with a supported OS and supported GPUs for GPUStack installation. See the supported platforms and supported accelerators documentation for details.
- Nodes are connected to the same network and can communicate with each other.
Step 1: Install GPUStack on the Server Node
First, you need to install GPUStack on one of the nodes to act as the server node. Follow the instructions below based on your operating system.
Linux or macOS
Run the following command on your server node:
curl -sfL https://get.gpustack.ai | sh -s -
Windows
Run PowerShell as administrator and execute the following command:
Invoke-Expression (Invoke-WebRequest -Uri "https://get.gpustack.ai" -UseBasicParsing).Content
Once GPUStack is installed, you can proceed to configure your cluster by adding worker nodes.
Step 2: Retrieve the Token from the Server Node
To add worker nodes to the cluster, you need the token that GPUStack generates on the server node. Run the command for your operating system on the server node to print the token:
Linux or macOS
cat /var/lib/gpustack/token
Windows
Get-Content -Path "$env:APPDATA\gpustack\token" -Raw
This token will be required in the next steps to authenticate worker nodes.
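To avoid copying the token by hand, it can be captured in a shell variable. The following is a minimal sketch for Linux or macOS, assuming the default token path shown above. The function name read_token and its optional path argument are illustrative and not part of GPUStack.

```shell
#!/bin/sh
# read_token: print the token stored in a file, with whitespace removed.
# The default path is the one used in this tutorial; the optional argument
# is a hypothetical override for non-default installations.
read_token() {
    token_file="${1:-/var/lib/gpustack/token}"
    if [ ! -r "$token_file" ]; then
        echo "cannot read token file: $token_file" >&2
        return 1
    fi
    # Strip the trailing newline (and any other whitespace) so the value
    # can be passed directly to --token.
    tr -d '[:space:]' < "$token_file"
}

# Example:
# TOKEN=$(read_token) && echo "token loaded (${#TOKEN} characters)"
```

If the file is not readable by the current user, the function prints an error and returns a non-zero status instead of producing an empty token.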
Step 3: Add Worker Nodes to the Cluster
Now, you will install GPUStack on additional nodes (worker nodes) and connect them to the server node using the token.
Linux or macOS Worker Nodes
Run the following command on each worker node, replacing http://myserver with the URL of your server node and mytoken with the token retrieved in Step 2:
curl -sfL https://get.gpustack.ai | sh -s - --server-url http://myserver --token mytoken
Windows Worker Nodes
Run PowerShell as administrator on each worker node and use the following command, replacing http://myserver and mytoken with the server URL and token:
Invoke-Expression "& { $((Invoke-WebRequest -Uri 'https://get.gpustack.ai' -UseBasicParsing).Content) } --server-url http://myserver --token mytoken"
Once the command has run, each worker node connects to the server node and becomes part of the GPUStack cluster.
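When there are many Linux or macOS workers, the join command above can be run on each of them from a single machine. The sketch below assumes key-based SSH access and sufficient privileges on every worker, which is an assumption of this example rather than something GPUStack requires. The function names join_command and join_workers and the host names are hypothetical.

```shell
#!/bin/sh
# join_command: print the worker install command from Step 3 for the given
# server URL and token.
join_command() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: join_command SERVER_URL TOKEN" >&2
        return 1
    fi
    printf 'curl -sfL https://get.gpustack.ai | sh -s - --server-url %s --token %s\n' \
        "$1" "$2"
}

# join_workers: run the join command on each listed host over SSH.
# Assumes SSH access to every host (an assumption of this sketch).
join_workers() {
    if [ "$#" -lt 3 ]; then
        echo "usage: join_workers SERVER_URL TOKEN HOST..." >&2
        return 1
    fi
    cmd=$(join_command "$1" "$2") || return 1
    shift 2
    for host in "$@"; do
        echo "joining $host"
        # Keep going if one host fails so the remaining hosts are still tried.
        ssh "$host" "$cmd" || echo "failed to join $host" >&2
    done
}

# Example (hypothetical host names):
# join_workers http://myserver mytoken worker1 worker2
```

Building the command in one function and running it in another makes it possible to print and review the exact command before it is executed on any worker.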
Step 4: Verify the Cluster Setup
After adding the worker nodes, you can verify that the cluster is set up correctly by accessing the GPUStack UI.
- Open a browser and navigate to http://myserver (replace myserver with the actual server URL).
- Log in with the default credentials (username admin). To retrieve the default password, run the following command on the server node:
Linux or macOS
cat /var/lib/gpustack/initial_admin_password
Windows
Get-Content -Path "$env:APPDATA\gpustack\initial_admin_password" -Raw
- After logging in, navigate to the Resources page in the UI to see all connected nodes and their GPUs. You should see your worker nodes listed and ready for serving LLMs.
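If the page does not load in the browser, a quick command-line check can show whether the server UI is reachable at all. The sketch below uses only standard curl options. The helper names http_status and is_success are illustrative, and treating any 2xx or 3xx response as "reachable" is an assumption of this example.

```shell
#!/bin/sh
# http_status: print the HTTP status code returned by a URL.
# curl prints 000 when the request fails, for example if the host is
# unreachable or the 10-second timeout expires.
http_status() {
    curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# is_success: return 0 when the status code is in the 2xx or 3xx range.
is_success() {
    case "$1" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: replace http://myserver with the actual server URL.
# if is_success "$(http_status http://myserver)"; then
#     echo "server UI is reachable"
# else
#     echo "server UI is not reachable" >&2
# fi
```

Running the same check from a worker node helps separate a network problem between nodes from a problem on the server itself.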
Conclusion
Congratulations! You have set up a multi-node GPUStack cluster. You can now distribute workloads across multiple nodes and use all of the GPUs available in the cluster.