Token
Definition
A token is the fundamental unit of text that a language model processes: a piece of a word, a whole word, or a punctuation mark. LLMs break input text into tokens before processing it, and pricing for API-based models is typically calculated per token consumed. For English text, one token is roughly 3-4 characters, or about 0.75 words. Token limits define the maximum context window an LLM can handle in a single request.
How It Works
Before any text reaches the neural network, a tokenizer breaks it into tokens using an algorithm such as Byte-Pair Encoding (BPE) or SentencePiece. These algorithms learn a vocabulary of common subword units from training data, typically 32,000 to 128,000 entries. Common words like "the" become single tokens, while rare words are split into subword pieces: "tokenization" might become ["token", "ization"]. Each token maps to an integer ID, which is what the model actually processes. A rough rule of thumb: one token is approximately three-quarters of an English word, so 1,000 tokens cover about 750 words. Every API call is billed by token count (input tokens plus output tokens), and model context windows are measured in tokens. The tokenizer is model-specific: GPT-4 uses the cl100k_base encoding, Claude uses Anthropic's own tokenizer, and Llama uses SentencePiece, so the same text may produce different token counts across models.
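The subword splitting described above can be sketched with a toy greedy longest-match tokenizer. This is a deliberately simplified illustration, not real BPE: the vocabulary and token IDs below are made up for the example, but the behavior mirrors how a trained tokenizer keeps common words whole and splits rare words into pieces.

```python
# Toy greedy longest-match subword tokenizer (illustrative only).
# TOY_VOCAB is a hypothetical vocabulary; real tokenizers learn
# tens of thousands of entries from training data.

TOY_VOCAB = {"the": 0, "token": 1, "ization": 2, "iz": 3, "ation": 4,
             "a": 5, "t": 6, "o": 7, "k": 8, "e": 9, "n": 10, "i": 11,
             "z": 12, "s": 13, "h": 14, " ": 15}

def tokenize(text: str) -> list[int]:
    """Greedily match the longest vocabulary entry at each position."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest match first
            if text[i:j] in TOY_VOCAB:
                ids.append(TOY_VOCAB[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

ids = tokenize("the tokenization")
inverse = {v: k for k, v in TOY_VOCAB.items()}
print([inverse[i] for i in ids])  # ['the', ' ', 'token', 'ization']
```

Note how "the" stays a single token while "tokenization" splits into ["token", "ization"], exactly the pattern described above; the model only ever sees the integer IDs.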
Why It Matters
Tokens are the currency of the LLM economy. Every API call you make is priced per token, every context window has a token limit, and generation latency scales with the number of tokens produced. Understanding tokenization helps you estimate costs accurately, stay within context limits, and optimize performance. A developer who knows that code tokenizes less efficiently than prose (more tokens per character, due to special characters and formatting) can make better architectural decisions. Token awareness also explains why models sometimes split words oddly or struggle with character-level tasks like counting letters: they never see individual characters, only token chunks.
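The 0.75-words-per-token rule of thumb turns directly into a back-of-envelope cost estimator. The function names and the $3/$15 per-million-token prices below are hypothetical placeholders for illustration; check your provider's current rate card before budgeting.

```python
# Back-of-envelope token and cost estimator using the rough
# ~0.75 words-per-token average for English prose.
# Prices here are illustrative, not any provider's actual rates.

WORDS_PER_TOKEN = 0.75  # rough English-prose average

def estimate_tokens(word_count: int) -> int:
    """Approximate token count from a word count."""
    return round(word_count / WORDS_PER_TOKEN)

def estimate_cost(input_tokens: int, output_tokens: int,
                  in_price_per_mtok: float, out_price_per_mtok: float) -> float:
    """Dollar cost given per-million-token input and output prices."""
    return (input_tokens * in_price_per_mtok
            + output_tokens * out_price_per_mtok) / 1_000_000

# A 1,500-word prompt producing a ~300-word answer, at hypothetical
# $3/M input and $15/M output pricing:
tokens_in = estimate_tokens(1500)   # -> 2000 tokens
tokens_out = estimate_tokens(300)   # -> 400 tokens
print(f"${estimate_cost(tokens_in, tokens_out, 3.0, 15.0):.4f}")  # $0.0120
```

Note that output tokens are often billed several times higher than input tokens, which is why a short answer to a long prompt can still dominate the bill.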
Real-World Examples
OpenAI's tiktoken library lets you count tokens programmatically before making API calls. Anthropic publishes per-token pricing for Claude models; for instance, Claude Sonnet costs $3 per million input tokens. Google's Gemini models offer a 1-million-token context window, one of the largest commercially available. The Hugging Face tokenizers library supports virtually every open-source model's tokenizer. On ThePlanetTools.ai, we factor token pricing into every AI tool review, comparing cost per token across providers like OpenAI, Anthropic, Google, and open-source alternatives hosted on Together AI or Fireworks AI.