Troubleshooting

Most problems come down to one of three things: the server isn’t running, Ollama isn’t reachable, or the context is mis-tuned. Start with the quick reference, then jump to the section you need.

Quick reference

Problem	First thing to check
Chat says server offline	Open the tray, click Start Server, then check Status
VS Code command missing	Reinstall the VSIX and reload VS Code
Server starts but the model is slow	First response loads the model — use Resume VRAM to warm it
GPU needed for another app	Click Pause VRAM
Attachments don’t appear	Drop them on the chat/composer area, or use the + button
Test command blocked	Add the command name in the tray under Tool Config
Responses look truncated	Set `OLLAMA_CONTEXT_LENGTH=16384` and restart Ollama
Ollama errors	Restart Ollama, then run tray Status again

The chat says the server is offline

Open the Riverforge Tray and click Start Server. Wait for the status log to show online.
Check the header dot in VS Code — it turns green when the server is reachable. The extension reconnects on its own.

Still red? Confirm the server is up from a terminal:

PS> (Invoke-WebRequest -UseBasicParsing http://127.0.0.1:8765/ready).StatusCode  # 200

If the tray also says offline, start the server from the tray. If Ollama is down, restart it from its tray icon.
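If you would rather wait for the server from a script than re-run the check by hand, the same readiness probe can be polled. This is a minimal sketch, assuming only the /ready endpoint shown above; the function names, retry count and delay are illustrative and not part of Riverforge.

```python
import time
import urllib.error
import urllib.request

READY_URL = "http://127.0.0.1:8765/ready"  # endpoint from the terminal check above


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timed out, and similar


def wait_until_ready(fetch=http_status, url=READY_URL, attempts=10,
                     delay=1.0, sleep=time.sleep):
    """Poll url until it returns 200; give up after `attempts` tries.

    The attempt count and delay are arbitrary defaults (assumptions).
    """
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until_ready() right after clicking Start Server returns True as soon as the server answers with 200, and False if it never does within the retry budget.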

The VS Code command is missing

The installer can occasionally land the extension in a different VS Code profile than the one you use. Install the bundled VSIX into your window by hand: Extensions view → … → Install from VSIX… → pick riverforge-vscode.vsix → reload. After an upgrade, if the commands still look old, reinstall the latest VSIX. See Installation.

Responses are slow

The first response after startup or a VRAM pause includes model warmup. Use Resume VRAM to warm the model before you need it.
Check nvidia-smi during generation. If GPU utilisation stays under 50%, layers are probably spilling to CPU, which usually means the model is too large for your VRAM. Running ollama ps shows the CPU/GPU split for each loaded model.
If every response cold-loads, confirm OLLAMA_KEEP_ALIVE=-1 and OLLAMA_MAX_LOADED_MODELS=3 are set, then restart Ollama.
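To watch utilisation without reading the full nvidia-smi table, you can ask it for CSV and compare against the 50% rule of thumb above. The sketch below assumes nvidia-smi is on PATH and uses its --query-gpu option; the function names are made up for illustration, and the parser assumes numeric fields (some GPUs report [N/A], which would raise an error here).

```python
import subprocess

# Standard nvidia-smi query: one CSV row per GPU, no header, no unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line):
    """Parse one CSV row such as '37, 7421, 8192' into a dict."""
    util, used, total = (int(field.strip()) for field in line.split(","))
    return {"util_pct": util, "mem_used_mib": used, "mem_total_mib": total}


def below_threshold(sample, threshold=50):
    """True if GPU utilisation is under the threshold from the text above."""
    return sample["util_pct"] < threshold


def read_gpus():
    """Run nvidia-smi once and return one parsed sample per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(row) for row in out.splitlines() if row.strip()]
```

Run read_gpus() while a response is generating; a single low reading between tokens is not conclusive, so take a few samples before deciding the model is too large.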

Output looks cut off

Out of the box Ollama serves a small context (2048 tokens on older releases) and silently truncates anything longer. Set OLLAMA_CONTEXT_LENGTH=16384 on the Ollama service and restart it. Riverforge sets this for you during install, so this only bites on a hand-tuned or source setup. See Models & Hardware.

“CUDA out of memory” / Ollama crashes

Click Pause VRAM, close other GPU-heavy apps, then Resume VRAM.
Confirm OLLAMA_CONTEXT_LENGTH=16384 and OLLAMA_MAX_LOADED_MODELS=3, then restart Ollama.
Stick with the default model unless you’re deliberately trying a larger one.
If a large model still won’t load, switch back to the 4B default — it’s sized for an 8 GB card.
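The Ollama settings named on this page can be checked in one pass. This is a rough sketch: the expected values are the ones quoted in this guide, the helper name is hypothetical, and it reads the environment of the process that runs it, which may differ from what the Ollama service actually sees (for example if Ollama was started before the variables were set).

```python
import os

# Values quoted in this troubleshooting guide.
EXPECTED = {
    "OLLAMA_CONTEXT_LENGTH": "16384",
    "OLLAMA_MAX_LOADED_MODELS": "3",
    "OLLAMA_KEEP_ALIVE": "-1",
}


def check_ollama_env(env=None, expected=EXPECTED):
    """Return {name: (actual, wanted)} for each variable that is unset or differs."""
    env = os.environ if env is None else env
    problems = {}
    for name, wanted in expected.items():
        actual = env.get(name)  # None when the variable is not set at all
        if actual != wanted:
            problems[name] = (actual, wanted)
    return problems
```

An empty result means all three variables match in the current environment. Anything else lists what to fix before restarting Ollama.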

Riverforge can’t see Ollama

Confirm Ollama is answering: curl http://localhost:11434/api/tags should return JSON. In Windows PowerShell, type curl.exe, because plain curl is an alias for Invoke-WebRequest.
If a firewall is blocking localhost, add a rule for ollama.exe.
Windows sometimes binds Ollama to IPv6 only. Run setx OLLAMA_HOST 127.0.0.1:11434, then quit and restart Ollama; setx only affects processes started afterwards.
Restart Ollama from its tray icon, then run the tray Status button again.
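When it is unclear whether the problem is Ollama itself or the address it is listening on, probing the IPv4, localhost and IPv6 forms side by side can narrow it down. A minimal sketch, assuming the default port 11434 and the /api/tags endpoint used above; the candidate list and function names are illustrative.

```python
import json
import urllib.error
import urllib.request

# The same port reached three ways: IPv4 loopback, name lookup, IPv6 loopback.
CANDIDATES = [
    "http://127.0.0.1:11434",
    "http://localhost:11434",
    "http://[::1]:11434",
]


def fetch_json(url, timeout=2.0):
    """GET url and decode the JSON body; return None if unreachable or not JSON."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


def probe_ollama(bases=CANDIDATES, fetch=fetch_json):
    """Map each base URL to the model names it reports, or None if it did not answer."""
    results = {}
    for base in bases:
        body = fetch(base + "/api/tags")
        if isinstance(body, dict):
            results[base] = [m.get("name") for m in (body.get("models") or [])]
        else:
            results[base] = None
    return results
```

If only the [::1] entry answers, the IPv6-only binding described above is the likely cause. If none answer, Ollama is down or blocked.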

It edits files but the tests never pass

Set the correct test/lint command in the tray under Tool Config.
Run that command yourself from the project root to confirm it works outside the agent.
Inspect the chat’s live tool rows for the exact failing command and output.
If the project needs a specific executable, make sure it’s on PATH — any executable on PATH is allowed; only sensitive system paths are blocked.
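The manual checks above (is the executable on PATH, and does the command pass from the project root) can be sketched as one helper. This is an illustration only: run_check is a hypothetical name, the command is passed as a list of arguments, and it does not reproduce how Riverforge itself launches tools.

```python
import shutil
import subprocess


def run_check(command, cwd="."):
    """Run command (a list of arguments) in cwd and summarise the result.

    Returns a dict with:
      found     - whether the executable resolved via PATH
      exit_code - the process exit code, or None if it was not found
      tail      - the last 20 lines of stdout followed by stderr
    """
    exe = shutil.which(command[0])
    if exe is None:
        return {"found": False, "exit_code": None, "tail": []}
    proc = subprocess.run([exe, *command[1:]], cwd=cwd,
                          capture_output=True, text=True)
    lines = (proc.stdout + proc.stderr).splitlines()
    return {"found": True, "exit_code": proc.returncode, "tail": lines[-20:]}
```

For example, run_check(["pytest", "-q"], cwd="path/to/project") with your own test command. A non-zero exit_code here means the command fails outside the agent too, so the fix belongs in the project rather than in Tool Config.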

Windows Defender flags Ollama

Defender occasionally quarantines ollama.exe after an update. Add %LOCALAPPDATA%\Programs\Ollama\ to your exclusions list, then restore ollama.exe from quarantine or reinstall Ollama.

Still stuck?

The tray’s Status button gives you a one-look snapshot of the server, Ollama, the loaded model and your tools — start there. For a deeper look, Riverforge keeps logs in your data folder; open it with the tray’s Open Data Folder button. Those logs are the most useful thing to include if you report a problem.