AI News

Independent coverage of the latest AI tool updates, releases, and comparisons.

AI LLMs

Transformer

Definition

The transformer is a deep learning architecture introduced in the 2017 paper "Attention Is All You Need" that relies entirely on self-attention mechanisms, with no recurrence or convolution. It replaced recurrent neural networks as the dominant architecture for language tasks and now underpins virtually every major AI model. Because transformers process all input tokens in parallel rather than one at a time, they train far more efficiently on modern GPUs than recurrent models did.

How It Works

Transformers use multi-head self-attention to weigh the importance of each token relative to every other token in a sequence. The architecture consists of encoder and decoder stacks, each containing attention layers and feed-forward networks with residual connections. Positional encodings are added to input embeddings since the architecture has no inherent sense of token order.
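The two mechanisms described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the function names are our own, and real multi-head attention adds learned projection matrices around this core computation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  -- the core of self-attention.
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row: one token's weights over all tokens
    return weights @ V, weights

def sinusoidal_positional_encoding(seq_len, d_model):
    # The paper's fixed encoding: sin at even dimensions, cos at odd ones,
    # with wavelengths forming a geometric progression up to 10000 * 2*pi.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Toy usage: 4 tokens, embedding size 8, self-attention (Q = K = V).
x = np.random.rand(4, 8) + sinusoidal_positional_encoding(4, 8)
out, weights = scaled_dot_product_attention(x, x, x)
```

Each row of `weights` sums to 1, so the output for every token is a weighted average of all token values; the positional encoding is what lets the model distinguish "dog bites man" from "man bites dog".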

Key Tools

  • GPT (OpenAI): industry-leading large language models powering ChatGPT. $20/mo (ChatGPT Plus)
  • Claude (Anthropic): safe, helpful AI assistant with extended context and reasoning. $20/mo (Pro)
  • Gemini (Google): Google's multimodal AI model family. $19.99/mo (Advanced)
  • Llama (Meta): open-source large language models from Meta. Free (open source)
  • Mistral: European AI lab building efficient open and commercial LLMs. Usage-based API

Related Terms

  • Large Language Model (LLM)
  • Fine-Tuning
  • Diffusion Model