What is the LLM Token Counter?
Every modern LLM bills by the token, a sub-word unit produced by each provider's tokenizer. "Hello world" is two tokens to GPT, three to Claude, and yet another count to Llama or Gemini. When you're comparing providers, drafting a long system prompt, or sizing a batch job, you want a single place where you can paste your text and see all the relevant counts and costs side by side. That's what this tool does, across Claude (Opus 4.7, Sonnet 4.6, Haiku 4.5), OpenAI (GPT-5, GPT-4o, GPT-4 Turbo), Llama 3.1 405B, and Gemini 2.5 Pro.
How the counts are computed
- OpenAI / GPT: uses the js-tiktoken library (the cl100k_base / o200k_base encodings used by GPT-4 and GPT-4o). Loaded lazily from a CDN on first count.
- Llama 3 / 3.1: uses llama-tokenizer-js, which ships the same SentencePiece BPE Meta uses. Loaded lazily on first count.
- Claude: Anthropic does not publish a browser tokenizer for Claude 3.5+. We use the standard chars / 3.5 heuristic, which is within roughly ±10% of the real count for English text. Marked with *.
- Gemini 2.5: Google's official tokenizer is server-side. We use a chars / 4 heuristic, also marked with *.
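The two character-based heuristics above can be sketched in a few lines. This is a minimal illustration, not the tool's actual source: the names CHARS_PER_TOKEN and estimateTokens are hypothetical, and rounding up with Math.ceil is an assumption, since the text does not say how fractional results are handled.

```javascript
// Divisors taken from the heuristics described above.
// The object name and keys are hypothetical, chosen for this sketch.
const CHARS_PER_TOKEN = { claude: 3.5, gemini: 4 };

// Estimate a token count from character length alone.
// Assumption: round up, so a non-empty string never estimates to 0 tokens.
// Note: String.length counts UTF-16 code units, so emoji and some
// non-Latin scripts will skew the estimate.
function estimateTokens(text, family) {
  const divisor = CHARS_PER_TOKEN[family];
  if (divisor === undefined) {
    throw new Error(`No heuristic defined for "${family}"`);
  }
  return Math.ceil(text.length / divisor);
}

const sample = "Count my tokens, please."; // 24 characters
console.log(estimateTokens(sample, "claude")); // 24 / 3.5 = 6.86, rounds up to 7
console.log(estimateTokens(sample, "gemini")); // 24 / 4 = 6
```

Because these are estimates rather than real tokenizer output, the result is best treated as a budget guide, which is why the tool marks such rows with *.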
How the cost column works
The "This call" column multiplies the input-token count by each model's per-million input price, then adds the cost of the assumed output tokens (default 0). Increase the Output tokens (assumed) value to simulate a generation as well — useful when you're estimating the all-in cost of a prompt + completion. Prices are stored in a single config block in the JS so they're easy to update; the snapshot in the table is from 2026-05-15. Always verify against the provider's pricing page before committing to a budget.
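The arithmetic behind the "This call" column can be sketched as follows. Treat it as an illustration under stated assumptions: the PRICES object, the model names, and the dollar figures are placeholders invented for this example (not real provider prices and not the tool's config), and the sketch assumes output tokens are billed at a separate per-million output rate.

```javascript
// Placeholder prices in USD per million tokens.
// Hypothetical model names and made-up figures, for illustration only.
const PRICES = {
  "example-small": { inputPerMillion: 1.0, outputPerMillion: 5.0 },
  "example-large": { inputPerMillion: 10.0, outputPerMillion: 50.0 },
};

// Cost of one call: input tokens at the input rate plus assumed
// output tokens at the output rate. outputTokens defaults to 0,
// matching the default described above.
function estimateCallCost(model, inputTokens, outputTokens = 0) {
  const price = PRICES[model];
  if (!price) {
    throw new Error(`Unknown model "${model}"`);
  }
  const inputCost = (inputTokens / 1_000_000) * price.inputPerMillion;
  const outputCost = (outputTokens / 1_000_000) * price.outputPerMillion;
  return inputCost + outputCost;
}

// Prompt only, then prompt plus an assumed 500-token completion.
console.log(estimateCallCost("example-small", 2000).toFixed(6));
console.log(estimateCallCost("example-small", 2000, 500).toFixed(6));
```

Keeping every rate in one object mirrors the single-config-block idea described above: when a provider changes its pricing, only that object needs editing.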
Why count tokens at all?
Token counts drive three things: cost (per-million-token pricing), latency (longer prompts take longer to process), and context window (you cannot exceed the model's limit). Knowing the count before you submit a prompt lets you trim instructions, decide between a Haiku-class model and an Opus-class one, or prove to a stakeholder that your batch job is feasible. Everything in this tool runs in your browser — your prompt is never sent to a server and is never logged.