Question 1

Does optimization affect output quality?

Accepted Answer

No. Masking removes context the model won't reference again. Routing only downgrades tasks where cheaper models produce identical results. Compression preserves all semantic content.

Question 2

How long does setup take?

Accepted Answer

About 5 minutes. Swap one URL in your tool's settings. No SDK, no code changes, no dependencies.

Question 3

What providers are supported?

Accepted Answer

Claude (Anthropic), ChatGPT (OpenAI), and Gemini (Google). The proxy handles format conversion transparently.

Question 4

Is my data stored?

Accepted Answer

No. Prompts are processed in memory and never persisted. Only usage metadata (model, token counts, costs) is retained.

Question 5

Can I control how aggressive the optimization is?

Accepted Answer

Yes. Compression has profiles from Low (12%) to High (72%). Routing and masking are separate toggles. You can enable any combination.

LLM Cost Optimization

You're paying for tokens you don't need

The Problem

The Solution

Before and After

How It Works

Frequently Asked Questions

Related Tools

Ready to start saving?