Question 1

How are tokens counted?

Accepted Answer

We use the standard approximation of 1 token per 4 characters for English text. Actual counts depend on the model's tokenizer and can vary by plus or minus 15%.
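The rule of thumb above can be sketched in a few lines. This is only an illustration of the 4-characters-per-token approximation and the 15% margin, not the calculator's actual code; the function names are hypothetical.

```python
import math

CHARS_PER_TOKEN = 4      # approximation for English text, from the answer above
VARIANCE_PERCENT = 15    # stated margin of error, plus or minus


def estimate_tokens(text: str) -> int:
    """Estimate the token count as characters / 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_token_range(text: str) -> tuple[int, int]:
    """Return (low, high) bounds around the estimate using the 15% margin."""
    estimate = estimate_tokens(text)
    low = estimate * (100 - VARIANCE_PERCENT) // 100          # rounded down
    high = (estimate * (100 + VARIANCE_PERCENT) + 99) // 100  # rounded up
    return low, high
```

For an exact count, use the tokenizer published for the specific model instead of this estimate.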

Question 2

Which models are included?

Accepted Answer

GPT-4o, GPT-4o mini, GPT-4 Turbo, o1, o1-mini, Claude Opus, Claude Sonnet, Claude Haiku, Gemini 2.0 Flash, Gemini 1.5 Pro, and more.

Question 3

Are prices accurate?

Accepted Answer

Prices are updated manually and may lag behind provider changes. Always verify current pricing on the provider's official pricing page before production use.

Question 4

What are output tokens?

Accepted Answer

Output tokens are the tokens the model generates in its response. They are typically priced at 3-5 times the input-token rate because generation is more compute-intensive: output tokens are produced one at a time, while input tokens can be processed in parallel.

Question 5

What is a token in LLMs?

Accepted Answer

A token is the basic unit of text that a language model processes — roughly 4 characters or 0.75 words in English. Tokens can be whole words, word fragments, or punctuation depending on the tokenizer.

Question 6

How many tokens is 1000 words?

Accepted Answer

Approximately 1,333 tokens (1000 divided by 0.75). A typical page of English prose is around 400-600 tokens. A 100,000-token context window holds roughly 75,000 words.
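The conversions above are simple arithmetic on the 0.75 words-per-token ratio. A minimal sketch, with hypothetical helper names and the ratio assumed to hold for English prose only:

```python
WORDS_PER_TOKEN = 0.75  # approximate ratio for English text


def words_to_tokens(words: int) -> int:
    """Convert a word count to an approximate token count."""
    return round(words / WORDS_PER_TOKEN)


def tokens_to_words(tokens: int) -> int:
    """Convert a token count to an approximate word count."""
    return round(tokens * WORDS_PER_TOKEN)


print(words_to_tokens(1000))     # 1333
print(tokens_to_words(100_000))  # 75000
```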

Question 7

What is the context window?

Accepted Answer

The context window is the maximum number of tokens a model can process in a single request, including both input (prompt) and output (completion). If a request exceeds it, the provider typically either rejects the request or truncates the input or output.
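Because input and output share the same window, it can help to check the budget before sending a request. The sketch below assumes a single combined limit; the function names are hypothetical, and some models also enforce a separate cap on output tokens that this does not model.

```python
def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    return input_tokens + max_output_tokens <= context_window


def remaining_output_budget(input_tokens: int, context_window: int) -> int:
    """Return how many tokens are left for the completion (never negative)."""
    return max(0, context_window - input_tokens)


# Example with a 100,000-token window and a 95,000-token prompt.
print(fits_context(95_000, 8_000, 100_000))      # False
print(remaining_output_budget(95_000, 100_000))  # 5000
```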

Question 8

How do I reduce LLM API costs?

Accepted Answer

Use a smaller model when full capability is not needed, cache repeated prompts, keep system prompts concise, stream responses so you can stop generation early, and batch requests where possible.
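One way to cache repeated prompts is a client-side lookup keyed on the model and prompt text, so identical requests never reach the API twice. This is a minimal in-memory sketch: `ResponseCache` and the `call_model` callable are hypothetical stand-ins for your own client code, and the approach only makes sense when returning the same response for the same prompt is acceptable.

```python
import hashlib
from typing import Callable


class ResponseCache:
    """In-memory cache that skips the API call for prompts already seen."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        # call_model(model, prompt) is a placeholder for a real API client call.
        self._call_model = call_model
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together so different models do not collide.
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._call_model(model, prompt)
        self._store[key] = response
        return response
```

Each cache hit avoids paying for both the input and the output tokens of that call.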

Question 9

What is prompt caching?

Accepted Answer

Prompt caching lets providers like Anthropic and OpenAI reuse the computed state of a repeated prefix — such as a long system prompt — across requests, reducing cost and latency.

Question 10

How do I estimate cost before calling the API?

Accepted Answer

Paste your prompt into this calculator, select your model, and it shows the estimated input token count and cost. Add your expected output token count to get the total estimated cost per call.
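The same estimate can be approximated in code. This sketch is not the calculator's implementation: it assumes the 4-characters-per-token rule from Question 1 and prices quoted per million tokens, and the function name and parameters are hypothetical. Prices are passed in by the caller and should come from the provider's official pricing page.

```python
import math


def estimate_call_cost(
    prompt: str,
    expected_output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> dict[str, float]:
    """Estimate the cost of one call from the prompt text and expected output size."""
    input_tokens = math.ceil(len(prompt) / 4)  # 1 token per 4 characters (approximation)
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = expected_output_tokens * output_price_per_million / 1_000_000
    return {
        "input_tokens": input_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }


# Placeholder prices for illustration only, not real provider rates.
estimate = estimate_call_cost("a" * 4000, 500, 2.0, 8.0)
print(estimate["input_tokens"], round(estimate["total_cost"], 6))
```

Multiply the total by your expected number of calls to get a rough budget for a workload.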

LLM Token & Cost Calculator

LLM Token Calculator — FAQ
