LLM Token & Cost Calculator
Paste any prompt to see the estimated token count and what it would cost across major AI providers.
Runs entirely in your browser
Prompt / Input Text
| Model | Context | Input cost | Output cost | Total cost | Queries / budget |
|---|---|---|---|---|---|
Token counts are estimated with the ~4 characters per token rule of thumb (±15% for English text). Prices are approximate and taken from provider documentation; always verify them on the OpenAI, Anthropic, Google AI, Mistral, and Together AI pricing pages before making cost decisions. Costs shown are per request and do not include batching discounts.
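The estimate described above can be sketched roughly as follows. This is a minimal illustration, not the calculator's actual code: the function names are hypothetical, rounding up is an assumption, and the prices in the usage lines are placeholders rather than real provider rates.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Estimate tokens as len(text) / 4, rounded up (rounding is an assumption)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def token_range(text: str, tolerance: float = 0.15) -> tuple[int, int]:
    """Return the (low, high) bounds implied by the +/-15% margin."""
    estimate = estimate_tokens(text)
    return math.floor(estimate * (1 - tolerance)), math.ceil(estimate * (1 + tolerance))


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one request, given prices in $ per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def queries_per_budget(budget: float, cost_per_request: float) -> int:
    """How many whole requests fit into a budget at the given per-request cost."""
    if cost_per_request <= 0:
        return 0
    return math.floor(budget / cost_per_request)


# Placeholder prices ($2 input, $8 output per 1M tokens), not real rates.
prompt = "a" * 400
cost = request_cost(estimate_tokens(prompt), 500, 2.0, 8.0)
print(estimate_tokens(prompt), token_range(prompt), cost, queries_per_budget(10.0, cost))
```

Because the character rule is only an approximation, it may be worth budgeting against the upper bound of `token_range` rather than the point estimate.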
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
Prices last verified mid-2025. Check provider pages for the latest rates.
LLM Token Calculator — FAQ
- How are tokens counted?
- We use the standard approximation of 1 token per 4 characters for English text. Actual counts depend on the model's tokenizer and can vary by plus or minus 15%.
- Which models are included?
- GPT-4o, GPT-4o mini, GPT-4 Turbo, o1, o1-mini, Claude Opus, Claude Sonnet, Claude Haiku, Gemini 2.0 Flash, Gemini 1.5 Pro, and more.
- Are prices accurate?
- Prices are updated manually and may lag behind provider changes. Always verify current pricing on the provider's official pricing page before production use.
- What are output tokens?
- Output tokens are the tokens the model generates in its response. They are typically priced 3-5x higher than input tokens because generation is more compute-intensive.
- What is a token in LLMs?
- A token is the basic unit of text that a language model processes — roughly 4 characters or 0.75 words in English. Tokens can be whole words, word fragments, or punctuation depending on the tokenizer.
- How many tokens is 1000 words?
- Approximately 1,333 tokens (1000 divided by 0.75). A typical page of English prose is around 400-600 tokens. A 100,000-token context window holds roughly 75,000 words.
- What is the context window?
- The context window is the maximum number of tokens a model can process in a single request, counting both input (prompt) and output (completion). If a request exceeds it, the API typically rejects the request or the input is truncated.
- How do I reduce LLM API costs?
- Use a smaller model when full capability is not needed, cache repeated prompts, keep system prompts concise, use streaming to stop generation early, and batch requests where possible.
- What is prompt caching?
- Prompt caching lets providers like Anthropic and OpenAI reuse the computed state of a repeated prefix — such as a long system prompt — across requests, reducing cost and latency.
- How do I estimate cost before calling the API?
- Paste your prompt into this calculator, select your model, and it shows the estimated input token count and cost. Add your expected output token count to get the total estimated cost per call.
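The word-based rule and the context window check from the answers above can be sketched like this. It is a minimal illustration under the same rules of thumb; the function names are hypothetical, and real limits depend on each model's tokenizer and documented context size.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English prose


def tokens_from_words(word_count: int) -> int:
    """Estimate tokens from a word count (words / 0.75)."""
    return round(word_count / WORDS_PER_TOKEN)


def words_from_tokens(token_count: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(token_count * WORDS_PER_TOKEN)


def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Input and output share the window, so their sum must not exceed it."""
    return input_tokens + max_output_tokens <= context_window


print(tokens_from_words(1000))                  # 1333
print(words_from_tokens(100_000))               # 75000
print(fits_context(90_000, 8_000, 100_000))     # True
print(fits_context(95_000, 8_000, 100_000))     # False
```

Reserving room for the expected output before sending a request is the main point of `fits_context`: a prompt that fits on its own can still overflow once the completion is counted.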