Run a local gateway that routes grunt work to Ollama on your machine. Frontier reasoning forwards to the cloud. Zero cloud cost on 70% of requests.
When your coding agent reads a file or checks git status, the request travels to a cloud API and back. You pay for the round trip, the compute, and the tokens. But for routine calls like these, a local model running on your machine can produce an equally good response for free.
The Tokonomy Local Gateway is a Docker container that runs on your machine. It classifies each incoming request and routes janitorial tasks to your local Ollama instance, while frontier tasks are forwarded to the Tokonomy cloud proxy for compression, routing, and analytics. And when the cloud returns a rate-limit error (HTTP 429), the gateway automatically retries the request on the local model.
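That fallback is simple to picture. Here is a minimal shell sketch of the policy, not the gateway's actual code; the proxy URL, API key, and request file are placeholders, and the OpenAI-style endpoint paths are an assumption:

    # Sketch of the fallback policy (illustrative; not the gateway's source).
    # Try the cloud proxy first and capture the HTTP status code.
    status=$(curl -s -o response.json -w '%{http_code}' \
      "$CLOUD_PROXY_URL/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d @request.json)

    # On a 429 rate limit, retry the same request against local Ollama,
    # which serves an OpenAI-compatible endpoint on port 11434.
    if [ "$status" = "429" ]; then
      curl -s http://localhost:11434/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d @request.json > response.json
    fi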
1. Install Ollama and pull a model (ollama pull llama3.2).
2. Run the gateway: docker run tokonomyai/gateway with your proxy URL (full commands below).
3. Point your tool at localhost:5177 instead of the provider URL.

Janitorial tasks run locally, frontier tasks forward to the cloud, and your tool sees no difference.
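Spelled out, steps 1 and 2 look like this. The install script is Ollama's official one; the PROXY_URL variable name and the host networking flag are illustrative, so check the gateway image's docs for the exact options:

    # 1. Install Ollama (official install script) and pull a small local model.
    curl -fsSL https://ollama.com/install.sh | sh
    ollama pull llama3.2

    # 2. Run the gateway on port 5177 with your Tokonomy proxy URL.
    #    --add-host lets the container reach Ollama on the host (needed on
    #    Linux; Docker Desktop provides host.docker.internal automatically).
    docker run -d -p 5177:5177 \
      --add-host=host.docker.internal:host-gateway \
      -e PROXY_URL="https://your-proxy.example.com" \
      tokonomyai/gateway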
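For step 3, most OpenAI-SDK-based tools take the swap as a single environment variable. The variable name depends on your tool, and the /v1 path assumes the gateway mirrors the provider's API shape:

    # Point an OpenAI-compatible tool at the local gateway.
    # OPENAI_BASE_URL works for the official OpenAI SDKs; other tools differ.
    export OPENAI_BASE_URL="http://localhost:5177/v1"

    # Smoke test: a trivial completion through the gateway.
    curl -s http://localhost:5177/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "ping"}]}'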
Create an account, add your first app, and swap one URL. Takes about 5 minutes.
Get Started Free