What Are AI Tokens? A Simple Explanation
Updated June 2026 · 5 min read
If you've ever hit a "context limit" error in ChatGPT, wondered why your API bill was higher than expected, or tried to figure out how much text a model can handle — you were dealing with tokens.
Here's a plain-English breakdown of what tokens are, why they matter, and how to work with them.
What is a token?
A token is the unit AI language models use to process text. Instead of reading character-by-character or word-by-word, models split text into small, non-overlapping chunks called tokens. A token can be a whole word, part of a word, or a single character or punctuation mark.
A useful rule of thumb for English:
- 1 token ≈ 4 characters
- 1 token ≈ ¾ of a word
- 100 tokens ≈ 75 words
- 1,000 words ≈ 1,300–1,500 tokens
Common words like "the", "and", or "is" are usually one token each. Rare or long words get split across multiple tokens. Punctuation and code symbols count as tokens too, and a space is usually folded into the token for the word that follows it.
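The rules of thumb above can be turned into a rough estimator. This is a minimal sketch for English text only; the function names are made up for illustration, and the result is an approximation, not what a real tokenizer would return.

```python
def tokens_from_chars(n_chars: int) -> int:
    """Estimate tokens from a character count (1 token is about 4 characters)."""
    return round(n_chars / 4)


def tokens_from_words(n_words: int) -> int:
    """Estimate tokens from a word count (1 token is about 3/4 of a word)."""
    return round(n_words * 4 / 3)


def estimate_tokens(text: str) -> int:
    """Rough English-only estimate based on character length."""
    if not text:
        return 0
    # Any non-empty text costs at least one token.
    return max(1, tokens_from_chars(len(text)))


print(tokens_from_words(75))    # 100, matching "100 tokens is about 75 words"
print(tokens_from_chars(280))   # 70, the length of a full tweet
```

The character-based estimate is used for raw text because it does not depend on how words are split; the word-based helper is handy when you only know a document's word count.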
Why do tokens matter?
Two reasons: context limits and API cost.
1. Context window limits
Every AI model has a maximum context window: the total number of tokens it can "see" at once, including both your input and its output. If you exceed it, an API request is typically rejected with an error, while chat apps usually drop the oldest messages, so the model appears to forget the beginning of the conversation.
Current context limits (as of 2026):
- GPT-4o: 128,000 tokens (~96,000 words)
- Claude 3.5 Sonnet: 200,000 tokens (~150,000 words)
- Gemini 1.5 Pro: 1,000,000 tokens (~750,000 words)
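Because input and output share the same window, it helps to budget for both before sending a request. The sketch below uses the limits listed above; the dictionary keys, the function names, and the 4-characters-per-token estimate inside `trim_history` are assumptions for illustration, and a real application would count with the model's own tokenizer.

```python
# Context limits taken from the list above (tokens).
CONTEXT_LIMITS = {
    "gpt-4o": 128_000,
    "claude-3.5-sonnet": 200_000,
    "gemini-1.5-pro": 1_000_000,
}


def fits_in_context(prompt_tokens: int, max_output_tokens: int, model: str) -> bool:
    """True if the prompt plus the reserved output allowance fits the window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_LIMITS[model]


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):           # walk from newest to oldest
        cost = max(1, round(len(message) / 4))   # rough estimate, not exact
        if used + cost > budget:
            break                                # everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))                  # restore chronological order


print(fits_in_context(120_000, 4_000, "gpt-4o"))   # True
print(fits_in_context(127_000, 4_000, "gpt-4o"))   # False
```

Dropping the oldest messages first mirrors the "forgetting" behavior described above; summarizing old messages instead is another common choice.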
2. API pricing
When you use AI models via API, you're billed per token, with prices usually quoted per 1,000 tokens (1K) or per 1,000,000 tokens (1M). Input tokens (what you send) and output tokens (what the model generates) are priced separately, with output usually costing more.
Example: GPT-4o at $2.50 / 1M input tokens means sending a 1,000-word prompt (about 1,333 tokens) costs roughly $0.003. That is tiny for one-off use, but it adds up fast at scale.
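The arithmetic behind that example can be sketched in a few lines. The $2.50 per 1M input tokens rate is the figure quoted above; the function name is hypothetical, and prices change, so check your provider's current price list before relying on the numbers. Output tokens are billed the same way at their own rate.

```python
def token_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of a given number of tokens at a per-1M-token price."""
    return tokens * price_per_million_usd / 1_000_000


# A 1,000-word prompt at roughly 4/3 tokens per word.
prompt_tokens = round(1000 * 4 / 3)                # 1333 tokens
per_request = token_cost_usd(prompt_tokens, 2.50)

print(f"${per_request:.4f} per request")           # $0.0033 per request
print(f"${per_request * 1_000_000:,.2f} per million requests")
```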
Tokens across different models
Different models use different tokenizers, so the same text can produce different token counts from one model to the next. The 4-characters-per-token estimate is a reasonable approximation for English across the major models. Code and non-English text often need noticeably more tokens for the same number of characters.
How to count tokens
You don't need to count manually. Use our free AI Token Calculator — paste your text and instantly see the token count and estimated API cost for GPT-4o, Claude, and Gemini.
For precise counts in code, use OpenAI's tiktoken library (Python) for OpenAI models, or the messages.countTokens method in @anthropic-ai/sdk for Claude.
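Here is a minimal sketch of an exact count with tiktoken. It assumes the package is installed (pip install tiktoken) and that o200k_base, the encoding tiktoken associates with GPT-4o, is the right one for your model. The helper name `count_tokens` and the fallback to the 4-characters-per-token estimate are illustrative choices, not part of the library.

```python
def count_tokens(text: str, encoding_name: str = "o200k_base") -> tuple[int, str]:
    """Return (token_count, method), where method is 'tiktoken' or 'estimate'."""
    try:
        import tiktoken  # third-party; may not be installed

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text)), "tiktoken"
    except Exception:
        # tiktoken is missing or its encoding data could not be loaded:
        # fall back to the rough 4-characters-per-token estimate.
        return max(1, round(len(text) / 4)), "estimate"


count, method = count_tokens("Tokens are the units language models read.")
print(f"{count} tokens ({method})")
```

Returning the method alongside the count makes it obvious when a number is only an estimate, which matters if you are using it for billing or hard context limits.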
Quick reference: token counts
- A tweet (280 chars) ≈ 70 tokens
- A 500-word blog post ≈ 650 tokens
- A full novel (80,000 words) ≈ 100,000–120,000 tokens
- A typical system prompt (200 words) ≈ 260 tokens
Paste any text into our free Token Calculator to see the count and cost estimate instantly.