LLM Token Counter & Cost Calculator — Claude, GPT, Llama, Gemini

What is the LLM Token Counter?

Every modern LLM API bills by the token, a sub-word unit produced by the provider's tokenizer. Counts differ from one provider to the next: "Hello world" might be two tokens to GPT, three to Claude, and a different count again for Llama or Gemini. When you're comparing providers, drafting a long system prompt, or sizing a batch job, you want a single place where you can paste your text and see all the relevant counts and costs side by side. That's what this tool does, across Claude (Opus 4.7, Sonnet 4.6, Haiku 4.5), OpenAI (GPT-5, GPT-4o, GPT-4 Turbo), Llama 3.1 405B, and Gemini 2.5 Pro.

How the counts are computed

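OpenAI / GPT: uses the js-tiktoken library with the cl100k_base and o200k_base encodings (used by GPT-4 and GPT-4o, respectively). Loaded lazily from a CDN on first count.
Llama 3 / 3.1: uses llama-tokenizer-js, a client-side BPE tokenizer for Llama models. Llama 3 replaced the SentencePiece tokenizer of Llama 1 and 2 with a tiktoken-style BPE over a 128K-token vocabulary, so counts are exact only when the library build matches the Llama 3 vocabulary. Loaded lazily on first count.
Claude: Anthropic does not publish a browser tokenizer for Claude 3.5+. We use a chars / 3.5 heuristic (character count divided by 3.5), which is within roughly ±10% of the real count for English text and may drift further for code or non-English text. Estimated counts are marked with *.
Gemini 2.5: Google's official tokenizer is server-side. We use a chars / 4 heuristic (character count divided by 4); these estimates are also marked with *.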
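
The two character-based heuristics above can be sketched in a few lines. This is an illustrative Python sketch, not the tool's own JavaScript: the function name `estimate_tokens`, the `CHARS_PER_TOKEN` table, and the choice to round up are assumptions made for the example, and only the 3.5 and 4 divisors come from the description above.

```python
import math

# Characters-per-token divisors quoted in the list above.
CHARS_PER_TOKEN = {"claude": 3.5, "gemini": 4.0}


def estimate_tokens(text: str, family: str) -> int:
    """Estimate a token count from character length (a heuristic, not a tokenizer)."""
    if not text:
        return 0
    ratio = CHARS_PER_TOKEN[family]
    # Rounding up is an assumption of this sketch; the tool may round differently.
    return math.ceil(len(text) / ratio)


prompt = "Summarize the following report in three bullet points."
for family in CHARS_PER_TOKEN:
    print(family, estimate_tokens(prompt, family))
```

Because the estimate depends only on length, it cannot tell dense code from plain prose, which is why these rows carry the * marker rather than being presented as exact counts.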

How the cost column works

The "This call" column multiplies the input-token count by each model's input price per million tokens, divides by 1,000,000, and adds the cost of the assumed output tokens (default 0), computed the same way from the output price. Raise the Output tokens (assumed) value to simulate a generation as well, which is useful when you're estimating the all-in cost of a prompt plus completion. Prices are stored in a single config block in the JS so they're easy to update; the snapshot in the table is from 2026-05-15. Always verify against the provider's pricing page before committing to a budget.
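
As a rough illustration of that arithmetic, here is a minimal Python sketch. The `Price` class, the `PRICES` table, the `call_cost` function, and the model names are all hypothetical, and the dollar figures are placeholders rather than any provider's real rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_million: float   # USD per 1,000,000 input tokens
    output_per_million: float  # USD per 1,000,000 output tokens


# Hypothetical placeholder prices, NOT real provider rates.
PRICES = {
    "example-small": Price(input_per_million=1.00, output_per_million=5.00),
    "example-large": Price(input_per_million=10.00, output_per_million=50.00),
}


def call_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Cost in USD of one call: input cost plus assumed output cost."""
    price = PRICES[model]
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


# A 2,000-token prompt with 500 assumed output tokens on each example model.
for name in PRICES:
    print(name, f"${call_cost(name, 2_000, 500):.4f}")
```

Keeping every price in one table mirrors the single config block described above: updating a rate is a one-line change, and the cost function itself never needs to be touched.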

Why count tokens at all?

Token counts drive three things: cost (per-million-token pricing), latency (longer prompts take longer to process), and context window (you cannot exceed the model's limit). Knowing the count before you submit a prompt lets you trim instructions, decide between a Haiku-class model and an Opus-class one, or prove to a stakeholder that your batch job is feasible. Everything in this tool runs in your browser: the tokenizer libraries are downloaded on first use, but your prompt is never sent to a server and is never logged.
