Run a local gateway that routes grunt work to Ollama on your machine. Frontier reasoning forwards to the cloud. Zero cloud cost on 70% of requests.
When your coding agent reads a file or checks git status, the request travels to a cloud API and back. You pay for the round trip, the compute, and the tokens. But for routine calls like these, a local model running on your machine can produce an equally good response for free.
The Tokonomy Local Gateway is a Docker container that runs on your machine. It classifies each incoming request and routes janitorial tasks to your local Ollama instance, while frontier tasks are forwarded to the Tokonomy cloud proxy for compression, routing, and analytics. And when the cloud returns a rate-limit error (HTTP 429), the gateway automatically retries the request on the local model.
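That fallback is simple to picture. Here is a minimal shell sketch of the policy, not the gateway's actual code; the proxy URL, API key, and request file are placeholders, and the OpenAI-style endpoint paths are an assumption:

    # Sketch of the fallback policy (illustrative; not the gateway's source).
    # Try the cloud proxy first and capture the HTTP status code.
    status=$(curl -s -o response.json -w '%{http_code}' \
      "$CLOUD_PROXY_URL/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d @request.json)

    # On a 429 rate limit, retry the same request against local Ollama,
    # which serves an OpenAI-compatible endpoint on port 11434.
    if [ "$status" = "429" ]; then
      curl -s http://localhost:11434/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d @request.json > response.json
    fi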
1. Install Ollama and pull a model (ollama pull llama3.2).
2. Run the gateway: docker run tokonomyai/gateway with your proxy URL (full commands below).
3. Point your tool at localhost:5177 instead of the provider URL.

Janitorial tasks run locally, frontier tasks forward to the cloud, and your tool sees no difference.
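Spelled out, steps 1 and 2 look like this. The install script is Ollama's official one; the PROXY_URL variable name and the host networking flag are illustrative, so check the gateway image's docs for the exact options:

    # 1. Install Ollama (official install script) and pull a small local model.
    curl -fsSL https://ollama.com/install.sh | sh
    ollama pull llama3.2

    # 2. Run the gateway on port 5177 with your Tokonomy proxy URL.
    #    --add-host lets the container reach Ollama on the host (needed on
    #    Linux; Docker Desktop provides host.docker.internal automatically).
    docker run -d -p 5177:5177 \
      --add-host=host.docker.internal:host-gateway \
      -e PROXY_URL="https://your-proxy.example.com" \
      tokonomyai/gateway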
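For step 3, most OpenAI-SDK-based tools take the swap as a single environment variable. The variable name depends on your tool, and the /v1 path assumes the gateway mirrors the provider's API shape:

    # Point an OpenAI-compatible tool at the local gateway.
    # OPENAI_BASE_URL works for the official OpenAI SDKs; other tools differ.
    export OPENAI_BASE_URL="http://localhost:5177/v1"

    # Smoke test: a trivial completion through the gateway.
    curl -s http://localhost:5177/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "ping"}]}'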
Create an account, add your first app, and swap one URL. Takes about 5 minutes.
Get Started Free