Self-Hosted / On-Device AI Writing Options
If your privacy requirement is absolute — text that simply cannot go to anyone else’s server, ever — then no cloud promise is enough. You need the model running on hardware you control. There are two flavors of that: on-device (the model runs on your own laptop/workstation) and self-hosted (you run the model on a server you operate). This page maps both, honestly.
On-device: the model runs on your machine
The simplest fully-private setup is a model running directly on the computer you’re typing on. Tools like Ollama make this approachable: install, pull an open-weight model (Llama, Mistral, Qwen, Gemma, Phi), and it runs in the background.
What you get: text never leaves the machine, zero per-edit cost, full offline operation, and no third party to trust or contract with.
What it costs you: the model is smaller and less capable than frontier cloud models, it uses your CPU/GPU and RAM while it runs, and quality is bounded by your hardware. For the routine edits that make up most of a writing day — grammar, tone, short rewrites — a good 7–8B model on-device is genuinely sufficient. For deep, long, nuanced work, you’ll feel the ceiling.
Right for: individuals with a capable machine who handle sensitive text, work offline, or just refuse on principle to send their words to a server.
Self-hosted: the model runs on a server you control
The next step up is running the model on your own server — on-premise hardware or a private cloud instance your team controls. Frameworks like vLLM, Ollama (server mode), LM Studio, or Text Generation Inference serve a model over your private network.
What you get: the privacy of local plus the ability to run bigger models (a beefy server can host models a laptop can’t), shared across a team, all inside your perimeter. Nothing crosses your firewall.
What it costs you: real infrastructure work — provisioning, GPUs, maintenance, scaling, security. This is an IT project, not a checkbox.
Right for: teams and regulated organizations (legal, health, finance) that need centralized, in-perimeter AI and have the IT capacity to run it. Often the only option for high-security or air-gapped environments — see behind a corporate firewall.
The open-weight model landscape (briefly)
Self-hosting and on-device both rely on open-weight models — models whose files are published to run locally. The field moves fast, but the families to know:
- Llama (Meta) — broad, widely supported, many sizes.
- Mistral / Mixtral — strong quality-per-size, efficient.
- Qwen — competitive, strong multilingual.
- Gemma (Google) — lightweight, good for on-device.
- Phi (Microsoft) — small models punching above their weight, ideal for quick edits on modest hardware.
For writing edits specifically, you don’t need the largest model — you need one that’s fast and clean at grammar, tone, and rewriting, which the smaller models handle well.
Where each option lands vs cloud
| Option | Privacy | Quality ceiling | Effort | Best for |
|---|---|---|---|---|
| On-device | Highest (nothing leaves) | Medium | Low–medium | Individuals with capable hardware |
| Self-hosted | Highest (in-perimeter) | Medium–high | High (IT project) | Teams, regulated orgs |
| BYOK cloud | High (vendor out of path) | Highest | Low | Power users, privacy + quality |
| Managed cloud | Good (no-logging) | Highest | None | Most professionals |
The pragmatic middle path — local for sensitive, cloud for hard — is smart local↔cloud routing.
EditSnappy and the fully-local path
EditSnappy’s audience includes people whose privacy bar is exactly this high. A local/on-device path is on the roadmap as the strongest expression of the product’s privacy promise, but it is not a confirmed shipping capability:
[[MISSING: confirm whether EditSnappy supports on-device / self-hosted / local models. These are silo topics; master-sales-copy lists only “smart local↔cloud routing” as a reach goal (§5). Teach the options generally; do not claim EditSnappy runs local/self-hosted until Ken confirms.]] [[MISSING: pricing model — a local/BYOK path’s availability ties to the pricing decision (master-sales-copy §8 option B).]]
What EditSnappy commits to today: no logging or retention of your text on its managed path, plus diff-before-commit and one-key undo so you control every change. If your requirement is strictly “nothing leaves my machine,” follow the on-device/self-hosted options above and confirm EditSnappy’s local support before relying on it.
See the full trust stack on the Privacy, Security & BYOK hub, or try EditSnappy free — no credit card.