Pretrained Language Models Explained GPT, BERT, LLaMA & Claude — The Transformers Shaping Modern AI
Foundation Models · Deep Learning

The Pretrained Models
Shaping Modern AI

A visual guide to GPT, BERT, LLaMA, and Claude — the transformer-based architectures that redefined what language models can do.

🧠
GPT Series
OpenAI · 2018 – present
Autoregressive · Decoder-only · Generative · Causal LM

The Generative Pre-trained Transformer family pioneered large-scale unsupervised pretraining on internet text followed by task-specific fine-tuning. GPT-3 (175B parameters) demonstrated that scale alone unlocks emergent few-shot abilities; later releases added RLHF alignment (InstructGPT) and multimodal reasoning (GPT-4).

175B · GPT-3 parameters
2018 · First release
96 · GPT-3 layers
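The decoder-only design behind GPT comes down to one constraint: during training, position i may only attend to positions ≤ i, so the model can be trained on next-token prediction over a whole sequence at once. A minimal numpy sketch of that causal mask (an illustration, not any model's actual implementation):

```python
import numpy as np

def causal_attention(q, k, v):
    """Single-head self-attention with a causal mask, as in decoder-only
    models like GPT: position i may only attend to positions <= i."""
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)                      # (t, t) attention logits
    mask = np.triu(np.ones((t, t), dtype=bool), k=1)   # True above the diagonal
    scores[mask] = -np.inf                             # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))          # 5 toy token embeddings, dim 8
out, w = causal_attention(x, x, x)
print(np.allclose(np.triu(w, k=1), 0.0))  # True: no weight on future tokens
```

Because the upper triangle of the attention matrix is forced to zero, every position's prediction depends only on earlier tokens, which is exactly what makes autoregressive generation consistent with training.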
🔍
BERT
Google · 2018
Bidirectional · Encoder-only · MLM · NSP

Bidirectional Encoder Representations from Transformers changed NLP benchmarks overnight. By masking random tokens and training the model to predict them using left and right context simultaneously, BERT produced deeply contextual embeddings ideal for classification, NER, QA, and semantic search.

340M · BERT-Large parameters
2018 · Released
24 · BERT-Large layers
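The masked-LM objective described above can be sketched in a few lines: hide a random subset of tokens and keep the originals as labels only at the hidden positions, so the model must use context from both sides to fill the gaps. A toy sketch (real BERT also applies an 80/10/10 mask/random/keep split, omitted here):

```python
import random

def make_mlm_example(tokens, mask_prob=0.15, seed=1):
    """Build a BERT-style masked-LM training pair: replace a random subset
    of tokens with [MASK]; labels keep the originals only at masked slots."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append("[MASK]")
            labels.append(tok)      # the model must predict this token
        else:
            inputs.append(tok)
            labels.append(None)     # position ignored by the loss
    return inputs, labels

inp, lab = make_mlm_example("the cat sat on the mat".split())
print(inp)  # → ['[MASK]', 'cat', 'sat', 'on', 'the', 'mat']
```

Since the loss is computed only at masked slots, the encoder is free to attend bidirectionally everywhere else, which is why its embeddings suit classification and retrieval better than generation.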
🦙
LLaMA Series
Meta AI · 2023 – present
Open-weights · Decoder-only · RoPE · GQA

Large Language Model Meta AI democratised foundation-model research by releasing competitive weights publicly. LLaMA 2 added grouped-query attention (GQA) for inference efficiency; LLaMA 3 extended the context window to 128K tokens and trained on over 15T tokens. Its open availability spurred thousands of fine-tunes and derivative models.

405B · Llama 3 max parameters
128K · Context window
15T · Training tokens
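RoPE, mentioned in the badges above, encodes position by rotating each consecutive pair of query/key dimensions through a position-dependent angle; the useful property is that the post-rotation dot product depends only on the *relative* offset between tokens. A minimal numpy sketch of that property (illustrative, not Meta's implementation):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding (RoPE) to one vector: rotate each
    even/odd dimension pair by an angle proportional to `pos`."""
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(half) / half)   # per-pair rotation frequencies
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                   # even / odd dimension pairs
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin             # 2-D rotation per pair
    out[1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)
# Same relative offset (2) at different absolute positions -> same score.
s1 = rope(q, 5) @ rope(k, 3)
s2 = rope(q, 105) @ rope(k, 103)
print(np.isclose(s1, s2))  # True
```

Because attention scores depend only on relative offsets, rotary embeddings extrapolate more gracefully to long contexts than learned absolute positions, which is part of why they pair well with extended context windows.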
Claude Series
Anthropic · 2023 – present
Constitutional AI · RLHF · Safety-first · Long context

Built around Constitutional AI — a method that uses a set of principles to guide self-critique and revision — Claude prioritises helpfulness, harmlessness, and honesty. Claude 3 Opus matched or exceeded GPT-4 on many benchmarks; the Claude 3.5 and 4 families extended multimodal reasoning and tool use.

200K · Context tokens
2023 · First release
CAI · Alignment method
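The critique-and-revise loop at the heart of Constitutional AI can be sketched as control flow. This is a toy illustration only: `critique_fn` and `revise_fn` are hypothetical stand-ins for model calls, not Anthropic's actual training pipeline (which uses the revised outputs as preference data for further training):

```python
def constitutional_revision(draft, principles, critique_fn, revise_fn, rounds=2):
    """Toy sketch of a Constitutional AI-style loop: for each principle,
    critique the current draft, and revise it when a flaw is found."""
    text = draft
    for _ in range(rounds):
        for principle in principles:
            critique = critique_fn(text, principle)
            if critique:                 # only revise when the critique flags something
                text = revise_fn(text, critique)
    return text

# Stub "model": flags drafts containing a rude word, rewrites them politely.
flag = lambda text, p: "rude wording" if "stupid" in text else None
fix = lambda text, c: text.replace("stupid", "mistaken")

out = constitutional_revision("That idea is stupid.", ["be respectful"], flag, fix)
print(out)  # → That idea is mistaken.
```

The key design point the sketch captures is that the principles drive self-critique rather than acting as hard output filters: the model generates, evaluates its own output against each principle, and rewrites.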

Architecture at a Glance

Model  | Architecture                      | Training objective                     | Best for
GPT    | Decoder-only transformer          | Next-token prediction (CLM)            | Open-ended generation, chat, code
BERT   | Encoder-only transformer          | Masked LM + next-sentence prediction   | Classification, NER, semantic search
LLaMA  | Decoder-only (RoPE + GQA)         | Next-token prediction (CLM)            | Open research, fine-tuning, edge deployment
Claude | Decoder-only + Constitutional AI  | RLHF + CAI self-critique               | Long-context reasoning, safe assistants

A Brief History

  • 2017
    Attention Is All You Need — Vaswani et al. introduce the Transformer, replacing recurrent nets with pure self-attention, laying the foundation for every model on this page.
  • 2018
    GPT-1 & BERT — OpenAI’s GPT shows unsupervised pretraining + fine-tuning wins at NLU. Google’s BERT simultaneously proves bidirectional context is king for understanding tasks.
  • 2020
    GPT-3 — 175B parameters and in-context few-shot learning stun the research community. Scale, it turns out, is a feature.
  • 2023
    LLaMA 1 & Claude 1 — Meta opens the weights to researchers; Anthropic ships Constitutional AI-aligned Claude. The open/closed dichotomy defines a new era of LLM competition.
  • 2024 – 25
    Claude 3 / 4, LLaMA 3, GPT-4o — Multimodal reasoning, 128K–1M token contexts, tool use, and real-time voice. The frontier accelerates faster than ever.
