// follow the current
Everything a coding agent should do — on your own machine.
Your code never leaves your PC.
Every model call, edit and memory write runs locally through Ollama. There’s no server to send anything to — and you can prove it with netstat.
TCP 127.0.0.1:8765 LISTENING — local only
$ curl 127.0.0.1:8765/ready
200 OK (no calls out)
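To make the netstat check concrete, here is a minimal sketch that takes a listening address like the one shown above and reports whether it is loopback-only. The address and port come from the sample output; the helper names `parse_listen_address` and `is_loopback_only` are illustrative and not part of Riverforge.

```python
import ipaddress
from typing import Tuple


def parse_listen_address(token: str) -> Tuple[str, int]:
    """Split an IPv4 'host:port' token, as printed by netstat, into its parts."""
    host, _, port = token.rpartition(":")
    return host, int(port)


def is_loopback_only(host: str) -> bool:
    """True when the address can only be reached from this machine."""
    return ipaddress.ip_address(host).is_loopback


if __name__ == "__main__":
    host, port = parse_listen_address("127.0.0.1:8765")
    print(f"{host}:{port} loopback-only: {is_loopback_only(host)}")
```

A listener bound to 0.0.0.0 would fail this check, which is the case worth looking for in your own netstat output.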
Say what you want. It builds it.
Vibe-code on your own machine: describe the change in plain English — “add retry logic and a test” — and Riverforge plans it, edits the files, runs the tests and hands you the diff. No incantation-prompts, no copy-paste, no cloud — and it won’t say “done” until the tests actually pass.
An agent that doesn’t bluff.
Every edit shows up as a diff and gets a second, near-deterministic review. A run won’t finish until your tests or linter pass, and it can’t cite a file it never opened or describe a deliverable that’s still a placeholder. Approve each step, or let it run on autopilot — and undo a whole run in one click.
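The "won't finish until your tests pass" rule can be pictured as a gate around the test command. The following is a rough sketch under stated assumptions, not Riverforge's actual loop: `run_until_green`, the `attempt_fix` callback and the `max_attempts` cap are all hypothetical names chosen for illustration.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_until_green(
    test_cmd: Sequence[str],
    attempt_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Run the test command; report success only when it exits with code 0.

    After each failure (except the last), the captured output is handed to
    attempt_fix so the caller can try another edit before the next run.
    """
    for attempt in range(max_attempts):
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        if attempt < max_attempts - 1:
            attempt_fix(result.stdout + result.stderr)
    return False


if __name__ == "__main__":
    # Stand-in for a real test runner: a command that always succeeds.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    print(run_until_green(passing, attempt_fix=lambda output: None))
```

The point of the design is that "done" is derived from the exit code of your own test or lint command, never from the model's description of its work.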
A brain that remembers — and barely slows down.
SQLite is the source of truth and FAISS does the search, split across three scopes — you, this project, your research. Recall is ~0.36 ms, and still ~3 ms across a million memories. Facts strengthen each time they’re used and gently fade when they’re not — and nothing is ever hard-deleted.
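The strengthen-on-use, fade-when-idle, never-hard-delete behaviour can be sketched with the standard `sqlite3` module alone. Everything specific here is an assumption made for illustration: the table layout, the scope names, the 30-day half-life and the `archived` flag are not documented Riverforge details, and the FAISS vector search is left out entirely.

```python
import math
import sqlite3

HALF_LIFE_DAYS = 30.0  # assumed decay rate, chosen only for this sketch
SECONDS_PER_DAY = 86400.0


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per memory."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE memories (
               id INTEGER PRIMARY KEY,
               scope TEXT NOT NULL CHECK (scope IN ('user', 'project', 'research')),
               text TEXT NOT NULL,
               strength REAL NOT NULL DEFAULT 1.0,
               last_used REAL NOT NULL,
               archived INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def remember(db: sqlite3.Connection, scope: str, text: str, now: float) -> int:
    """Insert a new memory and return its row id."""
    cur = db.execute(
        "INSERT INTO memories (scope, text, last_used) VALUES (?, ?, ?)",
        (scope, text, now),
    )
    return cur.lastrowid


def effective_strength(strength: float, last_used: float, now: float) -> float:
    """Exponential fade: strength halves every HALF_LIFE_DAYS of disuse."""
    age_days = (now - last_used) / SECONDS_PER_DAY
    return strength * math.pow(0.5, age_days / HALF_LIFE_DAYS)


def reinforce(db: sqlite3.Connection, memory_id: int, now: float) -> None:
    """Using a memory folds in the decay so far, then adds one unit of strength."""
    strength, last_used = db.execute(
        "SELECT strength, last_used FROM memories WHERE id = ?", (memory_id,)
    ).fetchone()
    new_strength = effective_strength(strength, last_used, now) + 1.0
    db.execute(
        "UPDATE memories SET strength = ?, last_used = ? WHERE id = ?",
        (new_strength, now, memory_id),
    )


def archive(db: sqlite3.Connection, memory_id: int) -> None:
    """Soft-delete: the row stays in the table and is only flagged."""
    db.execute("UPDATE memories SET archived = 1 WHERE id = ?", (memory_id,))
```

Keeping decay as a function of `last_used` means nothing has to be rewritten on a timer; strength is computed when a memory is read and only stored when it is used.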
An assistant that becomes yours.
It starts almost blank and earns a consistent character from real work with you — learning your style and forming its own opinions, modelling you as richly as itself. The voice has personality; the part that edits your files stays strict and predictable.
79 tools — it can touch your whole stack.
Read and edit files, run shells and tests, drive git, query a database, call an HTTP API, search the web, write to memory — 79 built-in tools, each checked before it runs. You never name them: just say what you want and Riverforge picks the right one. Need more? Register your own HTTP tools.
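"Each checked before it runs" can be illustrated with a small registry that validates a call against the tool's declared arguments before dispatching it. This is a sketch under assumptions, not Riverforge's tool API: `Tool`, `ToolRegistry` and the `read_file` example are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tool:
    name: str
    required_args: Dict[str, type]  # argument name -> expected Python type
    run: Callable[..., Any]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        """Add a tool; duplicate names are rejected rather than overwritten."""
        if tool.name in self._tools:
            raise ValueError(f"tool already registered: {tool.name}")
        self._tools[tool.name] = tool

    def call(self, tool_name: str, args: Dict[str, Any]) -> Any:
        """Validate the call, then run the tool. Nothing runs if a check fails."""
        tool = self._tools.get(tool_name)
        if tool is None:
            raise KeyError(f"unknown tool: {tool_name}")
        unexpected = set(args) - set(tool.required_args)
        if unexpected:
            raise ValueError(f"unexpected arguments: {sorted(unexpected)}")
        for arg, expected in tool.required_args.items():
            if arg not in args:
                raise ValueError(f"missing argument: {arg}")
            if not isinstance(args[arg], expected):
                raise TypeError(f"{arg} must be {expected.__name__}")
        return tool.run(**args)


if __name__ == "__main__":
    registry = ToolRegistry()
    registry.register(Tool("read_file", {"path": str}, lambda path: f"read {path}"))
    print(registry.call("read_file", {"path": "src/retry.py"}))
```

A user-registered HTTP tool would fit the same shape: a name, a declared argument schema, and a callable that performs the request.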
16K of context that reads like 100K.
Clever context engineering — an append-only evidence menu, large-file outlines and cross-run recall — lets it work across codebases far bigger than its window without losing the thread or guessing at what it hasn’t seen.
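One plausible reading of an "append-only evidence menu" is a running index of what the agent has actually opened, kept short enough to stay in the prompt while the full contents do not. The sketch below is only that reading; the class name `EvidenceMenu` and its methods are hypothetical, and the real mechanism may differ.

```python
from typing import List, Tuple


class EvidenceMenu:
    """Append-only index of opened files with a one-line summary each."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, str]] = []

    def add(self, path: str, summary: str) -> int:
        """Record a file that was opened; entries are never edited or removed."""
        self._entries.append((path, summary))
        return len(self._entries) - 1

    def can_cite(self, path: str) -> bool:
        """A file may be cited only if it appears in the menu."""
        return any(entry_path == path for entry_path, _ in self._entries)

    def render(self) -> str:
        """Compact text form suitable for a small context window."""
        return "\n".join(
            f"[{index}] {path}: {summary}"
            for index, (path, summary) in enumerate(self._entries)
        )


if __name__ == "__main__":
    menu = EvidenceMenu()
    menu.add("src/retry.py", "retry helper, 40 lines")
    print(menu.render())
    print(menu.can_cite("src/unopened.py"))
```

Because entries are only ever appended, earlier index numbers stay valid for the whole run, so later steps can refer back to them without re-reading the files.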
Offline by default. Online on request.
It needs no internet to work. Ask it to research and it searches the web, reads the real pages and cites them — then files what it learned into a separate research brain with the source URL, a hash and a confidence score, ready to reuse.
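A research entry that carries its source URL, a hash and a confidence score might look like the record below. This is a sketch with assumed details: the field names, the use of SHA-256 over the page text, and the 0 to 1 confidence range are illustrative choices, not documented Riverforge behaviour.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchNote:
    claim: str
    source_url: str
    content_hash: str  # hex digest of the page text the claim was read from
    confidence: float  # assumed range: 0.0 (guess) to 1.0 (certain)


def make_note(claim: str, source_url: str, page_text: str, confidence: float) -> ResearchNote:
    """Build a note, hashing the page text so later changes to the page are detectable."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    digest = hashlib.sha256(page_text.encode("utf-8")).hexdigest()
    return ResearchNote(claim, source_url, digest, confidence)


if __name__ == "__main__":
    note = make_note(
        claim="Example claim read from a page",
        source_url="https://example.com/docs",
        page_text="page body as fetched",
        confidence=0.8,
    )
    print(note.source_url, note.content_hash[:12], note.confidence)
```

Storing the hash alongside the URL lets a later run tell whether the page still says what it said when the note was filed.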
VS Code, a tray, a CLI — even your phone.
A dockable VS Code panel with diffs, image cards, attachments, undo and a watcher for “// AI?” comments. A tray app that frees your VRAM for a game in one click. A live memory visualiser, a full command line, and phone access over an outbound relay that opens no ports.
Built for 8 GB. Scales as you do.
Point it at any Ollama model (Gemma, Qwen, Llama, your own fine-tune) and it adapts sampling, prompts and reasoning (off / on / adaptive). Designed for an 8 GB-VRAM, 32 GB-RAM machine; give it more and it scales straight up: bigger models, longer context, a deeper brain. It gets more capable as your hardware does.
// the specs
Built lean, on purpose.
Code with an agent that’s yours.
One Windows installer and a VS Code extension. It sets up Ollama, pulls a model, and you’re pair-programming with a fully private agent in minutes. Not publicly released yet: it’s free and in private beta, with a public download coming soon.