Context Window

Simple Definition

A context window is the maximum amount of text (measured in “tokens”) that an AI model can process at one time. Think of it as the AI’s short-term memory.

If your conversation or document exceeds the context window, the excess is cut off or dropped, so the AI effectively “forgets” earlier content.

What Are Tokens?

Tokens are chunks of text. In English, one token is roughly 3-4 characters, or about 0.75 words, so a 1,000-word document is approximately 1,333 tokens.
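The arithmetic behind that estimate can be sketched in a few lines of Python. This is only the rule of thumb from the definition above, not a real tokenizer: actual counts depend on the model and the language of the text, and the function names here are hypothetical.

```python
# Rule of thumb from the text: one token is about 0.75 English words.
# This is an approximation; real tokenizers give different counts.
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_words(word_count: int) -> int:
    """Return a rough token estimate for a given number of words."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    # Splitting on whitespace is a crude way to count words.
    return estimate_tokens_from_words(len(text.split()))


print(estimate_tokens_from_words(1000))  # 1333
```

For exact counts, use the tokenizer published for the specific model you are working with.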

Context windows are measured in tokens rather than words because that’s how AI models process text internally.

Why Context Windows Matter

For long documents: If you paste a very long document into an AI tool and the document exceeds the context window, the AI may not see all of it.
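As a rough illustration, the sketch below checks whether a document is likely to fit in a given context window and, if not, splits it into smaller pieces. It assumes the 0.75-words-per-token rule of thumb rather than a real tokenizer, and the function names and the `reserved_for_reply` parameter are hypothetical.

```python
WORDS_PER_TOKEN = 0.75  # rule-of-thumb assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on the word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def fits_in_window(text: str, context_window: int, reserved_for_reply: int = 1000) -> bool:
    """Check whether the text fits, leaving some room for the model's answer."""
    return estimate_tokens(text) <= context_window - reserved_for_reply


def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split text into word-based chunks of at most roughly max_tokens each."""
    max_words = max(1, int(max_tokens * WORDS_PER_TOKEN))
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


document = "word " * 10_000  # stand-in for a long document
if not fits_in_window(document, context_window=8_000):
    chunks = split_into_chunks(document, max_tokens=4_000)
    print(f"Document split into {len(chunks)} chunks")
```

Splitting on word boundaries is the simplest option. Splitting on paragraphs or sections usually keeps each piece more coherent.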

For long conversations: In a very long back-and-forth conversation, earlier messages may fall outside the context window, causing the AI to “forget” what was said earlier.
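One way to picture this is a sliding window over the conversation: keep the most recent messages that fit the token budget and drop the oldest ones. The sketch below is a simplified illustration of that idea, not how any particular AI tool implements it. It assumes the 0.75-words-per-token rule of thumb, and `trim_history` is a hypothetical helper.

```python
WORDS_PER_TOKEN = 0.75  # rule-of-thumb assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on the word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages whose combined estimate fits in max_tokens."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop when the budget runs out.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["My name is Ana.", "I like hiking.", "What should I pack?"]
print(trim_history(history, max_tokens=10))
```

With a small budget, the earliest message (the one containing the name) is the first to be dropped, which is why the AI appears to forget it.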

For multi-step tasks: AI agents doing complex tasks need a context window large enough to hold all the relevant information (instructions, intermediate results, and tool outputs) without losing track.

Context Windows Have Grown Dramatically

Early LLMs had small context windows (a few thousand tokens). Modern models handle much more:

  • Claude 3.5+ can handle 200,000+ tokens (roughly 150,000 words)
  • GPT-4 Turbo and GPT-4o handle 128,000 tokens (the original GPT-4 handled far fewer)
  • Gemini 1.5 Pro handles 1 million tokens in some configurations

Larger context windows let you paste entire books, codebases, or very long documents for analysis.

Practical Implications

  • Pasting long documents: Use models with large context windows (like Claude) for analyzing lengthy content
  • Long conversations: For very long sessions, periodically summarize the conversation to stay within context
  • Using RAG: For enterprise applications with massive data, retrieval-augmented generation (RAG) works around context limitations by retrieving only the relevant passages and passing those to the model

Related Terms

  • LLM: the AI models that have context windows
  • RAG: a technique for connecting AI to external data beyond the context window
