Deep Learning · Generative Models
Generative Adversarial Networks
Two neural networks locked in creative rivalry — one invents, one judges — until reality and simulation become indistinguishable.
What is a GAN?
Introduced by Ian Goodfellow in 2014, a Generative Adversarial Network is a framework in which two neural networks train simultaneously through competition. The Generator learns to produce convincing synthetic data while the Discriminator learns to expose fakes. Neither can rest — each improvement by one forces the other to sharpen its skill.
Generator
Takes random noise z and maps it into data space — forging images, audio, or text that could plausibly be real.
Discriminator
Receives real samples alongside the Generator’s forgeries and outputs a probability: real or fake?
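The two roles can be sketched as a pair of small multilayer perceptrons. This is a minimal NumPy illustration, not any particular framework's API; the layer sizes, the tanh activation, and the 2-D "data space" are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden, n_out):
    """Random weights for a one-hidden-layer MLP."""
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * 0.1,
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_out)) * 0.1,
        "b2": np.zeros(n_out),
    }

def generator(params, z):
    """Map noise z into data space (here: 2-D points)."""
    h = np.tanh(z @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

def discriminator(params, x):
    """Return the probability that x is real."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    logit = h @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logit))

G = init_mlp(8, 16, 2)   # noise dim 8 -> data dim 2
D = init_mlp(2, 16, 1)   # data dim 2 -> one probability

z = rng.standard_normal((4, 8))   # a batch of 4 noise vectors
fake = generator(G, z)            # shape (4, 2): synthetic points
p_real = discriminator(D, fake)   # shape (4, 1), values in (0, 1)
```

The Generator never sees real data directly; its only learning signal is how the Discriminator scores its output.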
How Training Works
- Sample noise. Draw a random vector z from a simple distribution (e.g., Gaussian).
- Generate. The Generator G(z) transforms the noise into a synthetic sample.
- Discriminate. The Discriminator D receives both real data and G(z), outputting scores for each.
- Update D. Maximise D’s ability to correctly label real as 1 and fake as 0.
- Update G. Minimise G’s loss — it wants D to classify its output as real.
- Repeat until G produces samples so realistic that D can do no better than random chance (≈ 0.5).
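The loop above can be run end to end on a toy problem: real data drawn from N(3, 1) and a Generator that just shifts its noise, G(z) = z + θ. One simplification is made for stability: because real and fake samples are both unit-variance Gaussians here, the optimal Discriminator has the exact logistic form D*(x) = sigmoid(ax + b), so the sketch plugs in that closed form each round instead of taking gradient steps on D (an assumption that avoids the oscillation these toy games are prone to). The Generator update uses the non-saturating loss, matching the description of step 5.

```python
import numpy as np

rng = np.random.default_rng(0)

MU_REAL = 3.0      # real data: N(3, 1)
theta = 0.0        # generator parameter: G(z) = z + theta, z ~ N(0, 1)
lr, batch = 0.05, 256

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

for step in range(500):
    # Steps 1-2: sample noise and generate.
    z = rng.standard_normal(batch)
    x_fake = z + theta

    # Steps 3-4: for two unit-variance Gaussians the optimal discriminator
    # is exactly logistic; we use its closed form instead of gradient steps.
    a = MU_REAL - theta
    b = (theta**2 - MU_REAL**2) / 2.0
    d_fake = sigmoid(a * x_fake + b)

    # Step 5: update G (non-saturating loss: maximise log D(G(z))).
    grad_theta = np.mean((d_fake - 1.0) * a)
    theta -= lr * grad_theta

# Step 6: at convergence theta is close to 3 and D scores fakes near 0.5.
```

After training, θ sits near the real mean and the Discriminator's score on fakes hovers around 0.5, exactly the equilibrium described in step 6.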
The Minimax Objective
The entire training process is captured in a single minimax game:
minG maxD V(D, G) = 𝔼x~pdata[log D(x)] + 𝔼z~pz[log(1 − D(G(z)))]
D maximises this value by assigning high probability to real data and low probability to fakes, while G minimises it by making D(G(z)) as close to "real" as possible. At the global optimum, the Generator's distribution pg equals the data distribution pdata, and D can do no better than outputting 1/2 everywhere.
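For any fixed Generator, the optimal Discriminator has the closed form D*(x) = pdata(x) / (pdata(x) + pg(x)), a standard result from the original GAN analysis. A small numerical check (using unit Gaussians as stand-in densities) confirms the two claims above: when pg = pdata, D* is 1/2 everywhere and the value of the game is −log 4.

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 2001)

def gaussian(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

p_data = gaussian(x, 0.0)

# Generator far from the data: D* confidently separates the two.
p_g_far = gaussian(x, 3.0)
d_far = p_data / (p_data + p_g_far)

# Generator matches the data (p_g = p_data): D* collapses to 0.5 everywhere.
p_g_matched = gaussian(x, 0.0)
d_matched = p_data / (p_data + p_g_matched)

# Value of the game at the optimum: E[log 0.5] + E[log 0.5] = -log 4.
dx = x[1] - x[0]
value = (np.sum(p_data * np.log(d_matched))
         + np.sum(p_g_matched * np.log(1.0 - d_matched))) * dx
```

The mismatched case produces scores near 1 on one side and near 0 on the other, which is precisely the strong gradient signal the Generator trains against.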
Landmark Variants
The original GAN spawned an entire zoo of architectures. DCGAN (2015) stabilised training with convolutional layers. Conditional GAN allows class-guided generation by feeding labels to both networks. CycleGAN enables unpaired image-to-image translation without matched examples. StyleGAN (2019–2021) achieves photorealistic human faces with fine-grained style control, and BigGAN scales to class-conditional ImageNet synthesis at unprecedented fidelity.
Real-World Applications
GANs power image super-resolution, photorealistic face and art synthesis, image-to-image translation (sketches to photos, day to night), and data augmentation for scarce training sets. The same machinery underlies deepfakes, which has made synthetic-media detection an active research area of its own.
Key Challenges
Training GANs is notoriously delicate. Mode collapse occurs when the Generator finds a small set of outputs that fool D and stops exploring. Training instability arises when one network dominates — a too-powerful D gives G vanishing gradients; a too-weak D gives no useful signal. Evaluation is subjective: the Fréchet Inception Distance (FID) is the community standard but remains imperfect. Techniques like spectral normalisation, gradient penalty (WGAN-GP), mini-batch discrimination, and progressive growing help stabilise the adversarial dance.
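Spectral normalisation, one of the stabilisers named above, constrains each Discriminator weight matrix to have spectral norm 1 by dividing it by its largest singular value, which in practice is estimated cheaply by power iteration. A hedged NumPy sketch (real implementations amortise a single power-iteration step per training step; here it is run to convergence for clarity):

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_normalise(W, n_iters=100):
    """Divide W by its largest singular value, estimated via power iteration."""
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v            # estimated spectral norm of W
    return W / sigma

W = rng.standard_normal((64, 32)) * 2.0   # a Discriminator layer's weights
W_sn = spectral_normalise(W)              # spectral norm ~= 1 after scaling
```

Capping the spectral norm bounds the Lipschitz constant of the Discriminator, which keeps its gradients to the Generator from exploding or vanishing as one network starts to dominate.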

