Terse Documentation
Everything you need to install, configure, and use Terse — the on-device token optimizer and agent monitor for macOS.
Getting Started
Download and Install
Terse is distributed as a DMG file from GitHub Releases. Download the latest .dmg file for macOS, open it, and drag the Terse app into your Applications folder.
First-Time Setup
Terse is not currently signed with an Apple Developer certificate. On first launch, macOS will block the app. To open it:
- Open Finder and navigate to Applications
- Right-click (or Control-click) on Terse
- Select Open from the context menu
- In the dialog that appears, click Open to confirm
You only need to do this once. After the first launch, macOS will remember your choice and Terse will open normally.
Grant Accessibility Permissions
Terse uses the macOS Accessibility API to read and write text in other applications. On first launch, macOS will prompt you to grant Accessibility access:
- Open System Settings → Privacy & Security → Accessibility
- Click the + button and add Terse
- Ensure the toggle is enabled
Without Accessibility permissions, Terse cannot read text from connected apps or write optimized text back. The optimizer will still work in manual copy-paste mode, but automatic capture and replace will be disabled.
How Terse Works
Terse runs as a lightweight macOS application with two windows:
The Main Window
The main window is a session manager that lists all connected applications. You can open it at any time with Cmd+Shift+T. From here you can add or remove sessions, view optimization statistics, and access settings.
The Popup Window
The popup is a floating bar that appears at the top of your screen when you focus a connected app. It shows live optimization stats, mode controls, and action buttons (Capture, Replace, Copy). The popup uses alwaysOnTop rendering so it never steals focus from your active application.
The 7-Stage Optimization Pipeline
Every prompt that Terse processes passes through a pipeline of up to seven stages, depending on the selected optimization mode:
- Code block protection — extracts and preserves code blocks, URLs, and inline code so they are never modified
- Spell correction — fixes typos using a hardcoded dictionary for common coding/prompt typos, then runs macOS NSSpellChecker for broader coverage
- Whitespace normalization — collapses multiple spaces, removes trailing whitespace, normalizes line breaks
- Pattern optimization — applies 20+ rule-based transformations to shorten verbose phrases
- NLP analysis — identifies and removes filler words, hedging language, politeness markers, and meta-language
- Telegraph compression — removes articles, shortens syntax, applies abbreviations (Aggressive mode only)
- Code block restoration — reinserts all protected code blocks in their original positions
The pipeline runs entirely on-device, so no network round trips add latency. Typical processing time is under 5 milliseconds for prompts of up to 4,000 tokens.
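The protect, transform, restore shape of the pipeline can be illustrated with a minimal sketch. It is not Terse's implementation: the regular expression, the placeholder format, and the function names are assumptions chosen for illustration, and only the whitespace stage is shown between protection and restoration.

```python
import re

# Assumed patterns: fenced blocks first, then inline code, then URLs, so a URL
# inside a code block is captured as part of the block.
PROTECTED = re.compile(r"```.*?```|`[^`\n]+`|https?://\S+", re.DOTALL)


def protect(text):
    """Swap protected spans for placeholders; return (masked_text, spans)."""
    spans = []

    def stash(match):
        spans.append(match.group(0))
        return f"\x00{len(spans) - 1}\x00"  # hypothetical placeholder format

    return PROTECTED.sub(stash, text), spans


def normalize_whitespace(text):
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r" +\n", "\n", text)      # remove trailing whitespace
    text = re.sub(r"\n{3,}", "\n\n", text)  # keep at most one blank line
    return text.strip()


def restore(text, spans):
    """Put every protected span back where its placeholder sits."""
    return re.sub(r"\x00(\d+)\x00", lambda m: spans[int(m.group(1))], text)


def optimize(text):
    masked, spans = protect(text)
    return restore(normalize_whitespace(masked), spans)
```

Because the transformation only ever sees placeholders, the double space inside `` `a  b` `` or inside a fenced block survives while the surrounding prose is cleaned up.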
Connecting Apps
Terse automatically detects supported applications through a combination of macOS Accessibility API inspection and process scanning. No manual configuration is needed for supported apps.
ChatGPT (Chrome / Safari)
Terse detects ChatGPT via the Accessibility API when Chrome or Safari has a ChatGPT tab in focus. It reads the textarea content directly via AXTextArea / AXValue. Both the compose box and conversation history are accessible.
Claude Code
Claude Code runs as a terminal process. Terse detects it via process scanning (every 5 seconds) and tails the JSONL conversation log to extract token usage, tool calls, and costs. Prompt optimization is applied via the clipboard, since terminal emulators do not expose editable text fields through Accessibility.
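Tailing a JSONL log amounts to remembering a byte offset and parsing only the complete lines appended since the last poll. The sketch below shows that idea in general terms; the function name is hypothetical, and it makes no assumption about the fields inside each record, since the log schema is not described here.

```python
import json


def read_new_records(path, offset):
    """Return (records, new_offset) for complete JSONL lines appended after offset."""
    records = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                # End of file, or a line that is still being written.
                # Leave the offset alone so the next poll retries it.
                break
            offset = f.tell()
            if line.strip():
                records.append(json.loads(line))
    return records, offset
```

A caller would store the returned offset and pass it back on the next poll, so each record is parsed once even if the writer flushes mid-line.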
Cursor
Cursor is detected through both Accessibility API and process scanning. Terse reads the compose input via Accessibility when available, and falls back to selection-based capture (Cmd+C) for the Electron canvas editor, similar to VS Code.
OpenClaw
OpenClaw is detected via process scanning. Terse tails its conversation logs to extract token metrics and provides optimization for prompts via clipboard integration.
Aider
Aider is detected via process scanning. Because Aider is a terminal-based tool, Terse monitors its process output and provides prompt optimization through clipboard-based capture and replace.
VS Code
VS Code uses an Electron canvas renderer for its editor, which means the Accessibility API cannot read editor content directly. Terse detects VS Code via Accessibility and falls back to selection-based capture: it uses Cmd+C to copy selected text, optimizes it, and can write it back via Cmd+A + Cmd+V in the target field.
App Detection Notes
Terse prioritizes AXTextArea elements over AXTextField when scanning via Accessibility, since text areas are more likely to be prompt input fields. Results shorter than 5 characters are ignored as they are typically cursor indicators or UI fragments, not actual prompt content.
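The selection rule above can be sketched as a small filter and sort. The `AXElement` class and `pick_prompt_field` function are hypothetical stand-ins for whatever the Accessibility scan returns; only the role priority and the 5-character cutoff come from the description.

```python
from dataclasses import dataclass

MIN_PROMPT_CHARS = 5  # shorter values are treated as cursor indicators or UI fragments

# Lower number wins: text areas are preferred over text fields.
ROLE_PRIORITY = {"AXTextArea": 0, "AXTextField": 1}


@dataclass
class AXElement:
    """Hypothetical, simplified view of one scanned Accessibility element."""
    role: str
    value: str


def pick_prompt_field(elements):
    """Return the most likely prompt input, or None if nothing qualifies."""
    candidates = [
        e for e in elements
        if e.role in ROLE_PRIORITY and len(e.value) >= MIN_PROMPT_CHARS
    ]
    return min(candidates, key=lambda e: ROLE_PRIORITY[e.role], default=None)
```

In this sketch a text field is still used when no text area has enough content, which is one plausible reading of "prioritizes".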
Agent Monitoring
The agent monitoring system provides real-time visibility into what your AI agents are doing, how much they cost, and where efficiency can be improved.
Session Detection
Terse scans running processes every 5 seconds to detect active agent sessions. When a supported agent process is found (Claude Code, Cursor, OpenClaw, Aider), Terse begins monitoring its activity automatically.
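A process scan of this kind can be approximated by listing processes with `ps` and matching executable names. The sketch below is illustrative only: the executable names in `AGENT_PROCESS_NAMES` are guesses rather than the names Terse matches, and Terse's own scanner lives in its Rust backend.

```python
import os
import subprocess

# Assumed executable names; the real matching rules are not documented here.
AGENT_PROCESS_NAMES = {
    "claude": "Claude Code",
    "cursor": "Cursor",
    "openclaw": "OpenClaw",
    "aider": "Aider",
}


def match_agents(ps_output):
    """Parse 'pid command' lines and return (pid, agent_name) for known agents."""
    sessions = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1].strip()
        name = os.path.basename(command).lower()
        if name in AGENT_PROCESS_NAMES:
            sessions.append((pid, AGENT_PROCESS_NAMES[name]))
    return sessions


def scan_once():
    """List all processes without headers and match them against known agents."""
    result = subprocess.run(
        ["ps", "-ax", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return match_agents(result.stdout)
```

Calling `scan_once()` on a 5-second timer would reproduce the polling cadence described above.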
Metrics Tracked
- Input tokens — total tokens sent to the model across all turns
- Output tokens — total tokens generated by the model
- Cache read/write — tokens served from cache vs. freshly computed (for providers that support prompt caching)
- Tool calls — count and type of every tool invocation (Read, Edit, Bash, Grep, Glob, etc.)
- Cost — per-turn and cumulative session cost, computed from provider pricing
- Context fill — percentage of the model's context window currently in use
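As a rough illustration of how these metrics relate, the sketch below derives cumulative tokens, cost, and context fill from per-turn counts. Everything specific in it is an assumption: the `Turn` shape, the prices in `PRICE_PER_MTOK` (placeholders, not real provider rates), and the choice to estimate context fill from the latest turn's input.

```python
from dataclasses import dataclass

# Placeholder prices in USD per million tokens. These are NOT real provider
# prices; substitute the published rates for the model in use.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00, "cache_read": 0.30}


@dataclass
class Turn:
    """Hypothetical per-turn token counts."""
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int = 0


def turn_cost(turn):
    """Cost of one turn in USD under the placeholder prices."""
    return (
        turn.input_tokens * PRICE_PER_MTOK["input"]
        + turn.output_tokens * PRICE_PER_MTOK["output"]
        + turn.cache_read_tokens * PRICE_PER_MTOK["cache_read"]
    ) / 1_000_000


def summarize(turns, context_window):
    """Aggregate a non-empty list of turns into session-level metrics."""
    last = turns[-1]
    return {
        "input_tokens": sum(t.input_tokens for t in turns),
        "output_tokens": sum(t.output_tokens for t in turns),
        "cost_usd": sum(turn_cost(t) for t in turns),
        # Assumption: the latest turn's input (fresh plus cached) approximates
        # what currently occupies the context window.
        "context_fill_pct": 100 * (last.input_tokens + last.cache_read_tokens) / context_window,
    }
```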
Insights and Alerts
Terse generates actionable insights based on session activity:
- Context fill warnings — advisory at 60%, critical at 85%, limit warning at 95%
- Duplicate tool calls — flags when the agent invokes the same tool with the same arguments multiple times
- Redundant file reads — detects when a file is read again without intervening edits
- Cost spikes — alerts when a single turn exceeds a configurable cost threshold
- Cache efficiency — reports the ratio of cached vs. uncached input tokens
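Three of these checks are simple enough to sketch directly. The thresholds come from the list above; the function names, the `(tool_name, args)` call shape, and the `"path"` argument key are assumptions made for the example.

```python
import json
from collections import Counter


def context_fill_level(fill_pct):
    """Map a context fill percentage to the warning level described above."""
    if fill_pct >= 95:
        return "limit"
    if fill_pct >= 85:
        return "critical"
    if fill_pct >= 60:
        return "advisory"
    return None


def duplicate_tool_calls(calls):
    """calls is a list of (tool_name, args_dict); return calls seen more than once."""
    counts = Counter((tool, json.dumps(args, sort_keys=True)) for tool, args in calls)
    return {key: n for key, n in counts.items() if n > 1}


def redundant_reads(calls):
    """Return paths that were read again with no edit since the previous read."""
    read_since_edit = set()
    redundant = []
    for tool, args in calls:
        path = args.get("path")  # hypothetical argument name
        if tool == "Read":
            if path in read_since_edit:
                redundant.append(path)
            read_since_edit.add(path)
        elif tool == "Edit":
            read_since_edit.discard(path)
    return redundant
```

Serializing the arguments with sorted keys means two calls count as duplicates even when their arguments arrive in a different key order.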
CLAUDE.md Rule Generation
Based on observed patterns in agent sessions, Terse can generate CLAUDE.md rules that teach agents to avoid wasteful behaviors. For example, if an agent repeatedly reads the same configuration files at the start of every session, Terse generates a rule instructing the agent to skip those reads unless the task specifically requires them. The Selective Context research page describes how this extends selective context pruning across sessions.
Optimization Modes
Terse offers three optimization modes, each applying progressively more aggressive transformations. You can toggle between modes using the popup bar.
Soft Mode
The lightest optimization level. Soft mode applies only:
- Spell correction — fixes typos via the hardcoded dictionary and macOS NSSpellChecker
- Whitespace normalization — collapses extra spaces and blank lines
Soft mode preserves 100% of the original meaning and intent. It is ideal when you want clean, error-free prompts without any content changes. Typical token reduction: 2-5%.
Normal Mode
The balanced default. Normal mode includes everything in Soft mode plus:
- Filler word removal — strips "just", "actually", "basically", "really", and similar low-information words
- Hedging removal — removes "I think", "maybe", "perhaps", "it seems like" and other uncertainty markers
- Politeness pruning — strips "please", "could you kindly", "would you mind", and similar courtesy phrases that LLMs do not need
- Meta-language removal — removes "I want you to", "what I need is", "the thing is" and other self-referential phrasing
- Pattern optimization — applies 20+ phrase-shortening rules
Normal mode offers a good balance between compression and readability. The optimized text is still natural-sounding but more concise. Typical token reduction: 15-25%.
Aggressive Mode
Maximum compression. Aggressive mode includes everything in Normal mode plus:
- Telegraph compression — removes articles ("the", "a", "an"), shortens sentence structure
- Abbreviation — shortens common phrases to abbreviations where unambiguous
- Markdown stripping — removes formatting markers that consume tokens without affecting LLM comprehension
Aggressive mode may sacrifice some nuance and readability for maximum token savings. The resulting text reads more like shorthand or technical notes. Typical token reduction: 25-40%.
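The way each mode builds on the previous one can be shown with a toy example. This is a deliberately crude regex sketch, not Terse's NLP analysis: the word lists are a small sample taken from the descriptions above, and the `compress` function is hypothetical.

```python
import re

# A small sample of the fillers, hedges, and politeness markers listed above.
FILLERS = re.compile(
    r"\b(?:just|actually|basically|really|please|maybe|perhaps|I think)\b[ ,]*",
    re.IGNORECASE,
)
ARTICLES = re.compile(r"\b(?:the|a|an)\s+", re.IGNORECASE)


def compress(text, mode):
    """Apply progressively more aggressive removals for soft, normal, aggressive."""
    if mode in ("normal", "aggressive"):
        text = FILLERS.sub("", text)    # Normal mode and above
    if mode == "aggressive":
        text = ARTICLES.sub("", text)   # telegraph-style article removal
    return re.sub(r"\s{2,}", " ", text).strip()


prompt = "Please just fix the bug, it is really slow"
print(compress(prompt, "soft"))        # Please just fix the bug, it is really slow
print(compress(prompt, "normal"))      # fix the bug, it is slow
print(compress(prompt, "aggressive"))  # fix bug, it is slow
```

Soft mode leaves the wording untouched, which matches its goal of changing no content, while each later mode trades some readability for fewer tokens.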
Auto-Mode and Send-Mode
Auto-Mode
When Auto-Mode is enabled (toggled via the popup bar), Terse automatically monitors the active app's text field and writes the optimized version back whenever a change is detected. The workflow is fully hands-free:
- You type or paste a prompt in the connected app
- Terse detects the change via Accessibility polling
- The optimization pipeline runs on the new text
- The optimized text is written back to the app's input field via AXValue (or a Cmd+A + Cmd+V fallback)
Auto-Mode works best with apps that expose editable text fields via Accessibility (ChatGPT in Chrome/Safari, Cursor's compose box). For terminal-based agents like Claude Code and Aider, manual Send-Mode is recommended.
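One detail any poll-and-write-back loop has to handle is that writing the optimized text is itself a change to the field. The sketch below shows one way to avoid re-optimizing its own output; the `AutoMode` class and its injected `read_text`, `write_text`, and `optimize` callables are hypothetical and stand in for the Accessibility reads and writes.

```python
class AutoMode:
    """Poll a text field and write back optimized text when the user changes it."""

    def __init__(self, read_text, write_text, optimize):
        self.read_text = read_text    # e.g. reads the field's current value
        self.write_text = write_text  # e.g. writes a new value to the field
        self.optimize = optimize      # the optimization pipeline
        self.last_seen = None         # last text known to be fully processed

    def tick(self):
        """Run one polling step; return True if the field was rewritten."""
        current = self.read_text()
        if current == self.last_seen:
            return False  # nothing new, including our own previous write
        optimized = self.optimize(current)
        changed = optimized != current
        if changed:
            self.write_text(optimized)
        self.last_seen = optimized
        return changed
```

Recording the optimized text as `last_seen` means the next poll sees the loop's own write as "no change" and does nothing until the user edits the field again.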
Send-Mode (Manual)
Send-Mode is the default workflow. You control each step:
- Capture — reads the current text from the active app (or use Cmd+Shift+C to capture selected text)
- Replace — writes the optimized text back to the app's input field
- Copy — copies the optimized text to your clipboard for manual pasting
Send-Mode gives you full control over when optimization is applied and lets you review the result before committing it.
Keyboard Shortcuts
| Shortcut | Action | Context |
|---|---|---|
| Cmd + Shift + T | Toggle main window | Global — works from any application |
| Cmd + Shift + C | Capture selected text | Global — captures from the active app and sends to Terse for optimization |
Privacy & Security
Terse is designed with a strict on-device architecture. Your prompts, text content, and agent session data never leave your machine.
- 100% on-device processing — all optimization runs locally. There are no API calls to external services for text processing.
- No cloud, no telemetry — Terse does not phone home, does not collect usage analytics, and does not transmit any data over the network (except for license validation, which sends only your license key).
- No data storage — Terse does not persist your prompts or conversation content to disk. Text is held in memory only during active optimization and discarded immediately after.
- Code protection — code blocks, inline code, and URLs are extracted before optimization and restored after, ensuring they are never modified by the optimization pipeline.
- Accessibility API only — Terse reads app content through the official macOS Accessibility API, the same system used by screen readers and assistive technology. It does not inject code, hook processes, or use private APIs.
Pricing
Free
- 50 optimizations per week
- 1 connected session
- All three optimization modes
- Spell correction
- Basic popup interface
Pro
- Unlimited optimizations
- 3 concurrent sessions
- Agent monitoring dashboard
- Context fill warnings
- Duplicate tool call detection
- CLAUDE.md rule generation
Premium
- Unlimited everything
- Unlimited concurrent sessions
- Advanced agent insights
- Priority support
- Early access to new features
All plans include the full on-device optimization pipeline. Paid plans are billed monthly and can be cancelled at any time. After initial activation, licenses are validated locally and do not require a persistent internet connection.
Tech Stack
Terse is built on a modern, performant stack optimized for macOS:
- Tauri — Rust-based application framework. Handles window management, process scanning, IPC, and the agent monitoring backend. Tauri produces smaller, faster binaries (~15 MB) than Electron-based alternatives.
- Rust backend — the agent monitor, session manager, and JSONL log parser are implemented in Rust for performance and reliability. Token counting, cost computation, and context analysis run in Rust threads.
- Web frontend — the popup and main window UIs are built with vanilla HTML, CSS, and JavaScript. With no framework overhead, the popup renders in under 16 ms.
- Swift helper (terse-ax) — a compiled Swift binary that interfaces with the macOS Accessibility API. Handles reading text from apps (read-app), writing text back (write-pid), and spell checking via NSSpellChecker.
- Norvig-style spelling — the spell correction engine uses a probabilistic approach inspired by Peter Norvig's spell corrector, combined with a hardcoded dictionary for common coding and prompt typos.
- nspell / Hunspell — additional dictionary-based spell checking for broader language coverage beyond the hardcoded typo dictionary.
Need Help?
If you run into issues or have questions not covered here, you can:
- File an issue on GitHub Issues
- Check the Releases page for the latest version and changelog
- Explore the technique deep-dives: Spell Correction, Pattern Optimization, NLP Analysis, Telegraph Compression
- Read the research pages: LLMLingua, Norvig Spelling, Selective Context
Ready to Optimize?
Download Terse and start saving tokens in under two minutes. Free tier included.