Save every token
on your AI.

Cut input tokens by 40-70% on verbose prompts. Terse auto-compresses every prompt, fixes typos, deduplicates context, and tracks spend across ChatGPT, Claude Code, Cursor, and OpenClaw.

const optimizer = new PromptOptimizer();
const result = optimizer.optimize(text);
console.log(result.stats);
Works with ChatGPT, Claude Code, OpenClaw, Cursor, Aider, VS Code, Safari, Chrome, Claude.ai, Gemini, Windsurf, and Copilot.

Compress every prompt

Strips filler, fixes typos, shortens phrases, and compresses verbose text automatically, before send. You type naturally; the model receives 40-70% fewer input tokens on verbose prompts.
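As a rough illustration of filler stripping and phrase shortening, here is a minimal sketch. The rule tables `FILLER` and `SHORTEN` and the function `compress` are hypothetical stand-ins written for this example; Terse's actual rules are not shown on this page.

```javascript
// Hypothetical rule tables; the real rule set is not documented here.
const SHORTEN = [
  [/\bin order to\b/gi, "to"],
  [/\bdue to the fact that\b/gi, "because"],
  [/\bhelp me understand how to\b/gi, "explain how to"],
];
const FILLER = [
  /\bI was just wondering if you could\b/gi,
  /\b(?:just|really|basically|actually|perhaps|please)\b/gi,
];

function compress(prompt) {
  let out = prompt;
  // Shorten verbose phrases first, then delete pure filler.
  for (const [pattern, replacement] of SHORTEN) out = out.replace(pattern, replacement);
  for (const pattern of FILLER) out = out.replace(pattern, "");
  // Collapse the whitespace left behind and tidy spacing before punctuation.
  return out.replace(/\s+/g, " ").replace(/\s+([?.!,])/g, "$1").trim();
}

const verbosePrompt =
  "I was just wondering if you could perhaps help me understand how to implement a binary search tree in Python please?";
console.log(compress(verbosePrompt));
// prints "explain how to implement a binary search tree in Python?"
```

Regex rules like these are cheap and run locally, but they are blunt: a production rule set would need exceptions for quoted text and code.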

Optimize agent sessions

Terse monitors Claude Code, OpenClaw, and Cursor sessions in real time — compressing your messages, trimming redundant context, tracking tool calls, cache hits, and per-turn cost.

Eliminate wasted tokens

Catches typos before they cause retry loops, deduplicates repeated context across turns, drops low-value pleasantries, and prevents the conversation history bloat that makes agents expensive.

Agent Token Optimization

Every turn optimized,
automatically.

Agent sessions are token-intensive — a single task can consume 50x more tokens than a simple chat. Terse attacks this from every angle: compressing your prompts, deduplicating repeated context across turns, catching typos that cause retry loops, and tracking where every token goes.

  • Compresses every user message before it hits the API
  • Deduplicates context — catches repeated instructions across turns
  • Fixes typos before they cause agent retry loops
  • Tracks input, output, cache reads, tool calls per turn
  • Monitors JSONL session logs for real-time cost visibility
  • Savings compound: a 5-turn session saves 200-400 tokens
Claude Code · OpenClaw · Aider · Cursor Agent
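One way to read "savings compound" is that an agent re-sends the conversation history on every turn, so tokens trimmed from an early message are saved again on each later turn. The sketch below works through that arithmetic. Both the re-send assumption and the figure of 20 tokens saved per message are illustrative, not taken from Terse, and prompt caching would change the billing in practice.

```javascript
// Assumption: every turn re-sends the full history, so a message written on
// turn k is billed as input on turns k through totalTurns.
function cumulativeSavings(savedPerMessage, totalTurns) {
  let total = 0;
  for (let turn = 1; turn <= totalTurns; turn++) {
    const timesBilled = totalTurns - turn + 1;
    total += savedPerMessage * timesBilled;
  }
  return total;
}

console.log(cumulativeSavings(20, 5)); // prints 300, i.e. 20 * (5 + 4 + 3 + 2 + 1)
```

Under these assumptions a modest 20-token saving per message lands inside the 200-400 token range quoted for a 5-turn session.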
Agent Session — Live Optimization
  1. You type in the agent session
  2. Terse optimizes (compress + dedup + fix) in seven stages: Typo Correction, Context Dedup, Filler Removal, Prompt Compression, History Trimming, Imperative Rewrite, Final Cleanup
  3. Optimized prompt auto-sent to agent
  4. Agent runs while Terse tracks everything
  5. Cumulative savings, with a per-turn cost breakdown: Turn | Input tok | Output tok | Cache | Tools | Tok saved | Typos | Cost
Works Everywhere

Runs on
any app, automatically.

Connect Terse to Chrome, Cursor, VS Code, OpenClaw, or any terminal — it auto-detects text fields via macOS Accessibility and starts optimizing immediately. No plugins to install, no workflows to change. Just open the app and Terse is already working.

  • Auto-detects any text input — browsers, IDEs, terminals, agents
  • 7-stage pipeline runs on every prompt before send
  • Code blocks protected · on-device · zero latency
Pipeline — Live
  1. Spell Correction
  2. Whitespace
  3. Pattern Optimization
  4. Redundancy Elimination
  5. NLP Analysis
  6. Aggressive Compression
  7. Final Cleanup
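A staged pipeline that leaves code blocks untouched could be structured roughly as follows. This is a sketch under assumptions: `runPipeline`, the two demo stages, and the rule that only triple-backtick fences count as code are all hypothetical and chosen for illustration.

```javascript
// Built at runtime so this example can itself sit inside a fenced block.
const TICKS = "`".repeat(3);
const FENCE = new RegExp(TICKS + "[\\s\\S]*?" + TICKS, "g");

// Run every stage, in order, on the prose between fences; pass fenced code through verbatim.
function runPipeline(prompt, stages) {
  const codeBlocks = prompt.match(FENCE) || [];
  const proseSegments = prompt.split(FENCE);
  return proseSegments
    .map((segment, i) => {
      const optimized = stages.reduce((text, stage) => stage(text), segment);
      // Re-interleave: prose[0], code[0], prose[1], code[1], ...
      return optimized + (codeBlocks[i] || "");
    })
    .join("");
}

// Stand-ins for two of the stages; each stage is just a string-to-string function.
const demoStages = [
  (text) => text.replace(/\bplease\b ?/gi, ""), // filler removal
  (text) => text.replace(/[ \t]+/g, " "),       // whitespace cleanup
];

const demoPrompt =
  "please   fix this:\n" + TICKS + "\nconst  a = 1; // please keep\n" + TICKS + "\nthanks   a lot";
console.log(runPipeline(demoPrompt, demoStages));
```

Modeling each stage as a pure function keeps the order explicit and makes it easy to switch stages on or off.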
Live Token Optimization

Every prompt
rewritten live.

As you type, Terse edits your prompt in real time — fixing typos, stripping filler, compressing verbose phrasing. The optimized version is what gets sent to the model. Fewer tokens per message means lower cost and better responses.

  • Rewrites every prompt before it's sent — automatically
  • Context-aware: "what souls I do" → "what should I do"
  • Safe: skips ALL-CAPS words, Capitalized words, and code tokens
Spell Correction — Live
TYPOS Dict · Norvig · Context · macOS Spellcheck
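The Norvig layer presumably follows Peter Norvig's well-known approach: generate every string one edit away from the typo and pick the most frequent known word. The sketch below shows that idea together with the skip rules listed above. The tiny `WORD_FREQ` table, its frequencies, and all function names are made up for illustration. Context-aware fixes such as "souls" → "should" need more than a single-edit search and are not covered here.

```javascript
// Tiny stand-in vocabulary with invented frequencies; a real corrector loads a large word list.
const WORD_FREQ = new Map([
  ["the", 100], ["should", 90], ["what", 60], ["function", 40], ["return", 35],
]);
const LETTERS = "abcdefghijklmnopqrstuvwxyz";

// All strings one edit away: deletes, transposes, replaces, inserts.
function edits1(word) {
  const results = new Set();
  for (let i = 0; i <= word.length; i++) {
    const left = word.slice(0, i);
    const right = word.slice(i);
    if (right) results.add(left + right.slice(1));
    if (right.length > 1) results.add(left + right[1] + right[0] + right.slice(2));
    for (const c of LETTERS) {
      if (right) results.add(left + c + right.slice(1));
      results.add(left + c + right);
    }
  }
  return results;
}

// Skip anything with an uppercase letter (ALL-CAPS, Capitalized, camelCase)
// or a non-letter character (digits, underscores: likely a code token).
function shouldSkip(token) {
  return /[A-Z]/.test(token) || /[^A-Za-z]/.test(token);
}

function correctWord(token) {
  if (shouldSkip(token) || WORD_FREQ.has(token)) return token;
  let best = token;
  let bestFreq = 0;
  for (const candidate of edits1(token)) {
    const freq = WORD_FREQ.get(candidate) || 0;
    if (freq > bestFreq) {
      best = candidate;
      bestFreq = freq;
    }
  }
  return best;
}

function correctPrompt(prompt) {
  return prompt.replace(/[A-Za-z0-9_]+/g, (token) => correctWord(token));
}

console.log(correctPrompt("waht shoud teh API retrun"));
// prints "what should the API return"
```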
20+ Techniques

Every wasted
token found.

Filler removal, question-to-imperative, Jaccard deduplication, telegraph compression — each technique targets a different source of token waste in both manual prompts and agent conversations.

  • 130+ phrase-shortening rules for verbose prompts
  • Semantic dedup — catches repeated questions across turns
  • Low-info dropping — removes "thanks" and pleasantries
Techniques — Live

  • Filler Removal
  • Question → Imperative
  • Phrase Shortening
  • Semantic Dedup
  • Telegraph Style
  • Low-Info Drop
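Jaccard deduplication compares two messages by the overlap of their word sets: the size of the intersection divided by the size of the union. The sketch below is one plausible way to apply it across turns. The 0.8 threshold and the function names are assumptions, and the comparison is purely lexical, so paraphrases that share few words would not be caught.

```javascript
function wordSet(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) || []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, from 0 (disjoint) to 1 (identical word sets).
function jaccard(a, b) {
  const setA = wordSet(a);
  const setB = wordSet(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const word of setA) if (setB.has(word)) shared++;
  return shared / (setA.size + setB.size - shared);
}

// Keep a message only if it is not a near-duplicate of one already kept.
// The default threshold is an illustrative guess, not a documented value.
function dedupeTurns(messages, threshold = 0.8) {
  const kept = [];
  for (const message of messages) {
    const isDuplicate = kept.some((earlier) => jaccard(earlier, message) >= threshold);
    if (!isDuplicate) kept.push(message);
  }
  return kept;
}

console.log(dedupeTurns([
  "Use TypeScript strict mode",
  "use typescript strict mode.",
  "Add unit tests",
]));
// prints [ 'Use TypeScript strict mode', 'Add unit tests' ]
```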

Three Modes

You control
how much to save.

Different contexts need different levels. Soft for careful prompts, Normal for everyday use, Aggressive for agent sessions where every token counts against your bill.

  • Soft: Typo-fix only. Perfect for critical prompts.
  • Normal: Strips filler and hedging. Best for chat.
  • Aggressive: Max savings. Ideal for agents + APIs.
Mode Comparison
Soft · Normal · Aggressive
"I was just wondering if you could perhaps help me understand how to implement a binary search tree in Python please?"
22 tok · 0%
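The three modes can be thought of as progressively larger sets of enabled stages. The sketch below is a hypothetical mapping inferred from the bullet descriptions above; the `MODES` table, the toy `STAGES`, and `optimizeWithMode` are illustrative and not Terse's actual configuration.

```javascript
// Hypothetical mode-to-stage mapping; each mode adds stages to the previous one.
const MODES = {
  soft: ["typos"],
  normal: ["typos", "filler"],
  aggressive: ["typos", "filler", "imperative"],
};

// Toy single-rule stages, just enough to show the gating.
const STAGES = {
  typos: (t) => t.replace(/\bteh\b/g, "the"),
  filler: (t) => t.replace(/\b(?:just|perhaps|please)\b ?/gi, ""),
  // Question to imperative: "Could you X?" becomes "X."
  imperative: (t) =>
    t.replace(/^(?:can|could) you (.+?)\?$/i, (_, rest) => rest[0].toUpperCase() + rest.slice(1) + "."),
};

function optimizeWithMode(text, mode) {
  const enabled = MODES[mode];
  if (!enabled) throw new Error("unknown mode: " + mode);
  return enabled
    .reduce((out, name) => STAGES[name](out), text)
    .replace(/\s+/g, " ")
    .trim();
}

const question = "Could you please explain teh heap?";
console.log(optimizeWithMode(question, "soft"));       // prints "Could you please explain the heap?"
console.log(optimizeWithMode(question, "normal"));     // prints "Could you explain the heap?"
console.log(optimizeWithMode(question, "aggressive")); // prints "Explain the heap."
```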
Agent Monitor

See everything
your agent does.

Terse auto-detects Claude Code, OpenClaw, Aider, and Cursor Agent. It tails session logs in real time — tracking tokens, cost, tool calls, cache hits, typos caught, and cumulative savings.

  • Live token tracking: input, output, cache reads
  • Per-session cost estimation + savings potential
  • Spellcheck on every prompt before it's sent
  • Tool call tracking + activity feed
Claude Code · OpenClaw · Aider · Cursor Agent
Agent Monitor — Live
Tracked per session: Turns · Input · Output · Cost · Cache · Tools · Typos · Duration · Prompt savings
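Session monitoring of this kind can be approximated by parsing a JSONL log (one JSON object per line) and summing the usage fields. The sketch below assumes each entry carries a `usage` object with `input_tokens`, `output_tokens`, and `cache_read_input_tokens`, plus an optional `tool_calls` array. These field names and the prices are assumptions for illustration; the real log schemas of the supported agents may differ.

```javascript
// Hypothetical prices in USD per million tokens; substitute your model's real rates.
const PRICE_PER_MTOK = { input: 3, output: 15, cacheRead: 0.3 };

function summarizeSession(jsonlText) {
  const totals = { turns: 0, input: 0, output: 0, cacheRead: 0, tools: 0, cost: 0 };
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // a line that is still being written may be truncated; skip it
    }
    const usage = entry.usage;
    if (!usage) continue; // not a model turn
    totals.turns += 1;
    totals.input += usage.input_tokens || 0;
    totals.output += usage.output_tokens || 0;
    totals.cacheRead += usage.cache_read_input_tokens || 0;
    totals.tools += (entry.tool_calls || []).length;
  }
  totals.cost =
    (totals.input * PRICE_PER_MTOK.input +
      totals.output * PRICE_PER_MTOK.output +
      totals.cacheRead * PRICE_PER_MTOK.cacheRead) / 1e6;
  return totals;
}

const sampleLog = [
  '{"usage":{"input_tokens":1000,"output_tokens":500,"cache_read_input_tokens":10000},"tool_calls":[{"name":"read_file"}]}',
  '{"usage":{"input_tokens":2000,"output_tokens":100}}',
  '{"usage":{"input_tok', // truncated trailing line
].join("\n");
console.log(summarizeSession(sampleLog));
```

For live monitoring, the same function could be re-run, or fed only newly appended lines, each time the log file changes.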
Benchmarks

Tested on
real sessions.

Tested on real ChatGPT prompts, Claude Code sessions, and agent workflows. Clean technical prompts pass through untouched. Verbose prompts and agent messages see 40-70% reduction.

  • Benchmarked across manual prompts and agent turns
  • Clean prompts correctly return 0% — no false changes
Benchmarks — Aggressive Mode
  Typo-heavy rambling      -51%
  Verbose Docker Q         -60%
  Chatty debug request     -46%
  Mixed typos + filler     -64%
  Well-written (Normal)    -63%
  Repeated questions       -28%
  Clean technical            0%
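For readers who want to reproduce percentages like these on their own prompts, a reduction figure can be computed as one minus the ratio of tokens after to tokens before. The sketch below uses the common rule of thumb of about four characters per token for English text. That heuristic is an assumption made for this example; the page does not say how Terse counts tokens, and a real tokenizer will give different numbers.

```javascript
// Rough heuristic, not a real tokenizer: about 4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Percentage of tokens removed, rounded to the nearest whole number.
function reductionPercent(before, after) {
  const tokensBefore = estimateTokens(before);
  if (tokensBefore === 0) return 0;
  return Math.round((1 - estimateTokens(after) / tokensBefore) * 100);
}

const original = "a".repeat(400);  // stands in for a 100-token prompt
const optimized = "a".repeat(160); // stands in for a 40-token prompt
console.log(reductionPercent(original, optimized)); // prints 60
```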

...and loved by developers

Engineers and AI power users cutting costs and gaining visibility into their token usage.

Marcus Chen
@marcuschen_dev
Been using @TerseApp with Claude Code for a week. Token usage dropped ~40% on agent sessions. The spellcheck alone saves me from costly correction loops — and the monitor shows exactly where tokens go.
Claude Code
Sarah Kim
@sarahk_ai
I type verbose ChatGPT prompts out of habit. Terse catches all my filler words and hedging in real time. -60% tokens on average. It's like Grammarly for token efficiency.
ChatGPT
Jake Ortiz
@jakeortiz
The agent monitor alone is worth it. I can see input/output/cache per turn, tool call costs, and which Cursor sessions are burning the most tokens. Finally have visibility into agent spend.
Cursor Agent
Amara Patel
@amara_codes
Runs 100% on-device. No API calls, no cloud. As someone who works with sensitive codebases, this was the only token optimizer I'd actually trust. Privacy-first done right.
Privacy
Ravi Nguyen
@ravi_ng
Set up OpenClaw + Terse and my API bill dropped immediately. Auto-mode rewrites prompts before send, the monitor tracks every turn's cost, and catching typos means fewer "sorry, I meant..." follow-ups.
OpenClaw
Elena Vasquez
@elena_v
The three modes are perfect. Soft for important prompts where every word matters, Aggressive for quick throwaway questions. Terse adapts to how I work, not the other way around.
3 Modes

Built on research.

Grounded in LLMLingua, Norvig spelling, selective context pruning, and real-world prompt analysis.

7 pipeline stages
20+ token reduction techniques
Built-in typo corrections
Multiple apps & agents supported

Stop wasting
tokens and money.

Every prompt optimized. Every typo caught. Every agent session tracked. All on your machine — free, private, no cloud required.

GitHub
100% on-device · Free & open source · Zero latency