This tutorial guides you through setting up OpenCode to use a local LLM via llama-swap, using GLM-4.7-Flash-32B as the example model.
Install llama-swap:
Go to https://github.com/mostlygeek/llama-swap and follow the installation method that suits your platform.
See my guide on setting up llama-swap.
Start the llama-swap server (adjust port as needed):
llama-swap-launch.sh
See my guide on creating the launchers.
The server will start on http://localhost:8080/ by default. Client programs connect through the OpenAI-compatible endpoint at http://localhost:8080/v1
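Before wiring in OpenCode, you can confirm the endpoint is up by querying the OpenAI-compatible /v1/models route. A minimal sketch (the list_models helper is illustrative, not part of llama-swap; adjust BASE_URL to match your host and port):

```python
import json
import urllib.request

# Assumed default llama-swap address; change if you launched it on another port.
BASE_URL = "http://localhost:8080/v1"

def list_models(base_url, timeout=3.0):
    """Return the model IDs the server advertises, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["id"] for m in data.get("data", [])]
    except OSError:
        # Covers connection refused, timeouts, and DNS errors (URLError is an OSError).
        return None

models = list_models(BASE_URL)
if models is None:
    print("llama-swap not reachable -- is the server running?")
else:
    print("available models:", models)
```

The model name you use later in opencode.json should appear in this list.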
Edit ~/.local/share/opencode/auth.json to add your llama-swap API key. You may set any key you like, as long as you use the same value in the other configs:
{
  "llamaswap": {
    "type": "api",
    "key": "llama"
  }
}
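If you prefer to script this step, the fragment below writes the same auth entry and round-trips it to confirm the file is valid JSON. A sketch only: it writes to a temporary directory rather than ~/.local/share/opencode, and "llama" is the arbitrary placeholder key used above:

```python
import json
import os
import tempfile

# Same structure as the auth.json entry above; "llama" is an arbitrary key --
# any string works as long as the provider config uses the same value.
auth = {"llamaswap": {"type": "api", "key": "llama"}}

# Write to a scratch path (use ~/.local/share/opencode/auth.json for real setup).
path = os.path.join(tempfile.mkdtemp(), "auth.json")
with open(path, "w") as f:
    json.dump(auth, f, indent=2)

# Round-trip to confirm the file parses and has the expected shape.
with open(path) as f:
    loaded = json.load(f)
assert loaded["llamaswap"] == {"type": "api", "key": "llama"}
print("wrote", path)
```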
Edit ~/.config/opencode/opencode.json with the following configuration. Note that the baseURL below points at a LAN address; if llama-swap runs on the same machine, use http://localhost:8080/v1 instead:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llamaswap": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "llama-swap (GLM-4.7-Flash-32B)",
      "options": {
        "baseURL": "http://192.168.0.69:8080/v1",
        "apiKey": "llama"
      },
      "models": {
        "GLM-4.7-Flash-32B": {
          "name": "GLM-4.7-Flash-32B"
        }
      }
    }
  },
  "model": "GLM-4.7-Flash-32B",
  "small_model": "GLM-4.7-Flash-32B"
}
Test your configuration by running OpenCode and ensuring it connects to llama-swap:
opencode
You should see OpenCode successfully connecting to your local llama-swap instance.
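If OpenCode cannot connect, it helps to bypass it and send a chat completion straight to the endpoint. A hedged sketch (build_request is an illustrative helper, not part of OpenCode or llama-swap; adjust BASE_URL to your server, and make sure API_KEY matches the key in auth.json):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # assumed local server; adjust as needed
API_KEY = "llama"                      # must match the key in auth.json

def build_request(base_url, api_key, model, prompt):
    """Build the OpenAI-compatible chat request that OpenCode will issue."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_request(BASE_URL, API_KEY, "GLM-4.7-Flash-32B", "Say hello.")
try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
except OSError as e:
    print("request failed:", e)
```

A successful reply here means the server side is fine and any remaining problem is in the OpenCode config (typically a mismatched apiKey or model name).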