Generative Adversarial Networks

Deep Learning · Generative Models


Two neural networks locked in creative rivalry — one invents, one judges — until reality and simulation become indistinguishable.

What is a GAN?

Introduced by Ian Goodfellow and colleagues in 2014, a Generative Adversarial Network is a framework in which two neural networks train simultaneously through competition. The Generator learns to produce convincing synthetic data while the Discriminator learns to expose fakes. Neither can rest — each improvement by one forces the other to sharpen its skill.

🎨

Generator

Takes random noise z and maps it into data space — forging images, audio, or text that could plausibly be real.

VS
🔍

Discriminator

Receives real samples alongside the Generator’s forgeries and outputs a probability: real or fake?
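
The two networks above can be sketched as small multilayer perceptrons. This is a minimal PyTorch sketch — the latent dimension, layer widths, and data size are illustrative choices, not values from the original paper:

```python
import torch
import torch.nn as nn

LATENT_DIM = 64   # size of the noise vector z (illustrative)
DATA_DIM = 784    # e.g. a flattened 28x28 image

# Generator: maps noise z into data space.
G = nn.Sequential(
    nn.Linear(LATENT_DIM, 128),
    nn.ReLU(),
    nn.Linear(128, DATA_DIM),
    nn.Tanh(),        # outputs in [-1, 1], matching normalised data
)

# Discriminator: maps a data point to the probability it is real.
D = nn.Sequential(
    nn.Linear(DATA_DIM, 128),
    nn.LeakyReLU(0.2),
    nn.Linear(128, 1),
    nn.Sigmoid(),     # probability in (0, 1)
)

z = torch.randn(16, LATENT_DIM)   # a batch of noise vectors
fake = G(z)                       # synthetic samples
scores = D(fake)                  # D's belief that each sample is real
```

Real GANs for images use convolutional architectures (as DCGAN showed), but the interface is the same: G consumes noise, D emits a probability.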

How Training Works

  • Sample noise. Draw a random vector z from a simple distribution (e.g., Gaussian).
  • Generate. The Generator G(z) transforms the noise into a synthetic sample.
  • Discriminate. The Discriminator D receives both real data and G(z), outputting scores for each.
  • Update D. Maximise D’s ability to correctly label real as 1 and fake as 0.
  • Update G. Minimise G’s loss — it wants D to classify its output as real.
  • Repeat until G produces samples so realistic that D can do no better than random chance (≈ 0.5).
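
One iteration of the loop above can be sketched in PyTorch with binary cross-entropy losses. Here random noise stands in for a real minibatch, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 32
G = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(), nn.Linear(16, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 16), nn.LeakyReLU(0.2),
                  nn.Linear(16, 1), nn.Sigmoid())
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(64, data_dim)    # stand-in for a batch of real data
z = torch.randn(64, latent_dim)     # 1. sample noise
fake = G(z)                         # 2. generate

# 3-4. update D: push scores on real data toward 1, on fakes toward 0
d_loss = (bce(D(real), torch.ones(64, 1))
          + bce(D(fake.detach()), torch.zeros(64, 1)))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# 5. update G: it wants D to label its output as real (target 1)
g_loss = bce(D(fake), torch.ones(64, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```

The G update here uses the "non-saturating" loss from the original paper — maximising log D(G(z)) rather than minimising log(1 − D(G(z))) — which gives stronger gradients early in training when D easily rejects G's output.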

The Minimax Objective

The entire training process is captured in a single minimax game:

Objective Function:

  min_G max_D V(D, G) = 𝔼_{x ∼ p_data}[log D(x)] + 𝔼_{z ∼ p_z}[log(1 − D(G(z)))]

D maximises the expected log-probability it assigns to real data, while G minimises the probability that D catches its fakes. At the global optimum, the Generator's distribution p_g equals the data distribution p_data, and the best D can do is output 1/2 everywhere.
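
Plugging that optimum into the value function gives V = −log 4, a useful sanity check during training. A quick numeric check in pure Python, with no networks involved:

```python
import math

# At the global optimum p_g = p_data, the optimal discriminator
# D*(x) = p_data(x) / (p_data(x) + p_g(x)) outputs 0.5 for every input.
d_opt = 0.5

# V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], both expectations at 0.5:
value = math.log(d_opt) + math.log(1 - d_opt)

print(value)   # -1.3862943611... which is exactly -log(4)
```

If a training run's value function settles well away from −log 4 ≈ −1.386, one network is likely dominating the other.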

Landmark Variants

The original GAN spawned an entire zoo of architectures. DCGAN (2015) stabilised training with convolutional layers. Conditional GAN allows class-guided generation by feeding labels to both networks. CycleGAN enables unpaired image-to-image translation without matched examples. StyleGAN (2019–2021) achieves photorealistic human faces with fine-grained style control, and BigGAN scales to class-conditional ImageNet synthesis at unprecedented fidelity.
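
The conditional trick mentioned above is mechanically simple: concatenate a class label (here a one-hot vector) to the Generator's noise and to the Discriminator's input, so both networks see the condition. A minimal sketch with single linear layers and illustrative sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_classes, latent_dim, data_dim = 10, 16, 32
G = nn.Linear(latent_dim + n_classes, data_dim)   # one layer, just for shapes
D = nn.Linear(data_dim + n_classes, 1)

z = torch.randn(4, latent_dim)
labels = torch.tensor([0, 3, 3, 7])
y = F.one_hot(labels, n_classes).float()          # one-hot class conditioning

fake = G(torch.cat([z, y], dim=1))    # G sees noise + label
score = D(torch.cat([fake, y], dim=1))  # D judges the sample-label pair jointly
```

Because D scores the pair, it penalises samples that are realistic but mismatched with their label, which is what forces G to respect the condition.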

Real-World Applications

🖼️Image Synthesis
🎭Deepfake Video
💊Drug Discovery
🎵Music Generation
👗Fashion Design
🏥Medical Imaging
🗺️Super Resolution
✍️Text-to-Image

Key Challenges

Training GANs is notoriously delicate. Mode collapse occurs when the Generator finds a small set of outputs that fool D and stops exploring. Training instability arises when one network dominates — a too-powerful D gives G vanishing gradients; a too-weak D gives no useful signal. Evaluation is subjective: the Fréchet Inception Distance (FID) is the community standard but remains imperfect. Techniques like spectral normalisation, gradient penalty (WGAN-GP), mini-batch discrimination, and progressive growing help stabilise the adversarial dance.
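
Of those stabilisers, the WGAN-GP gradient penalty is the easiest to show in code: penalise the critic's gradient norm for straying from 1 on random interpolates between real and fake samples. A sketch, using the penalty weight λ = 10 suggested in the WGAN-GP paper (all other sizes illustrative):

```python
import torch
import torch.nn as nn

def gradient_penalty(critic, real, fake, lam=10.0):
    """WGAN-GP term: lam * E[(||grad_x critic(x)|| - 1)^2] on interpolates."""
    eps = torch.rand(real.size(0), 1)                       # per-sample mix
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(
        outputs=scores.sum(), inputs=interp, create_graph=True
    )
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

critic = nn.Sequential(nn.Linear(8, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1))
real = torch.randn(32, 8)
fake = torch.randn(32, 8)
gp = gradient_penalty(critic, real, fake)   # added to the critic's loss
```

Note that a WGAN critic omits the final sigmoid and outputs an unbounded score; the gradient penalty replaces the weight clipping of the original WGAN.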

Generative Adversarial Networks  ·  Goodfellow et al., 2014  ·  Deep Learning
