GPT
Simple Definition
GPT stands for Generative Pre-trained Transformer. It’s the family of AI models developed by OpenAI that powers ChatGPT. Each word in the name describes how it works:
- Generative — it generates new text
- Pre-trained — it is first trained on a large, general text dataset, then adapted (fine-tuned) for specific uses such as chat
- Transformer — it uses the transformer neural network architecture
The GPT Model Line
| Model | Released | Key Milestone |
|---|---|---|
| GPT-1 | 2018 | Proof of concept |
| GPT-2 | 2019 | OpenAI initially withheld the full model over misuse concerns |
| GPT-3 | 2020 | 175 billion parameters; could perform new tasks from a few examples in the prompt |
| GPT-3.5 | 2022 | Powered the original ChatGPT |
| GPT-4 | 2023 | Multimodal, much more capable |
| GPT-4o | 2024 | Faster, cheaper, natively multimodal |
GPT vs. ChatGPT
GPT is the underlying model — the AI technology itself.
ChatGPT is the product — the chat interface that lets people interact with GPT models.
When people say “I use ChatGPT,” they mean the product. The technology running underneath it is a GPT model (today, typically GPT-4o or one of its variants).
How GPT Models Work
GPT models are trained to predict the next token in a sequence. After training on billions of documents from the internet, books, and code, this basic task generalizes into the ability to write essays, answer questions, write code, and reason through complex problems.
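To make "predict the next token" concrete, here is a minimal sketch in Python. It is not how GPT is implemented: it swaps the transformer for a simple bigram counter, which only looks at the single previous word, and it treats whole words as tokens. The function names (`train_bigram`, `predict_next`, `generate`) and the tiny corpus are made up for illustration. What it does share with GPT is the loop: learn from text which token tends to come next, then generate by repeatedly predicting one token and appending it.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token is followed by each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen with one."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_tokens=5):
    """Greedy generation: repeatedly append the predicted next token."""
    output = [start]
    for _ in range(max_tokens):
        next_token = predict_next(counts, output[-1])
        if next_token is None:
            break
        output.append(next_token)
    return output


# Toy "training data": whitespace-separated words stand in for tokens.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

print(predict_next(model, "the"))            # cat
print(generate(model, "on", max_tokens=2))   # ['on', 'the', 'cat']
```

A real GPT model replaces the lookup table with a transformer that conditions on the whole preceding context and outputs a probability for every token in its vocabulary, but generation still proceeds one predicted token at a time.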
Related Terms
- LLM — GPT models are LLMs
- Transformer — the architecture GPT is built on
- Foundation Model — GPT models are foundation models
- Generative AI — GPT is one of the best-known examples of generative AI for text