Overview

The paper investigates conditional adversarial networks as a general-purpose solution to a variety of image-to-image translation tasks, achieving modest results at low resolution (512x512).

Method

  • Improved Objective - Adds constraints to the conditional GAN objective
  • U-Net-style Generator - Explores U-Net as an alternative to the traditional encoder-decoder architecture
  • 70x70 PatchGAN Discriminator - Evaluates realness per patch, which saves memory and gives comparable performance

Improved Objective

  • Conditions the Generator's output on the input image rather than generating unconstrained output from noise
  • Naive use of L2 distance alone, without a Discriminator, would yield blurry results
  • The Discriminator acts as a 'learnable loss' function that pushes the Generator toward sharp, realistic outputs
  • Adds an additional L1 distance constraint between the output and the ground truth
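The combined objective described above can be sketched as follows. This is a minimal numpy illustration, not the authors' code: `generator_loss` is a hypothetical name, `d_fake` stands for the Discriminator's scores on generated patches, and `lam` is the L1 weight (the paper uses lambda = 100).

```python
import numpy as np

def generator_loss(d_fake, fake, target, lam=100.0):
    """Sketch of the pix2pix generator loss: adversarial term + lambda * L1.

    d_fake: Discriminator outputs in (0, 1) for the generated patches.
    fake, target: generated image and ground truth as arrays.
    """
    # Non-saturating GAN term: push D(x, G(x, z)) toward 1
    adv = -np.mean(np.log(d_fake + 1e-12))
    # Pixel-wise L1 constraint against the ground truth (reduces blur vs. L2)
    l1 = np.mean(np.abs(target - fake))
    return adv + lam * l1
```

With a perfect fake (D fooled, zero L1 error) the loss goes to zero; the L1 term dominates whenever the output drifts pixel-wise from the target.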

Conditional Objective: generate a realistic image that corresponds to the input (x)
Overall Objective: the adversarial term combined with a weighted L1 term
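Written out in the paper's notation, the conditional and overall objectives are:

```latex
% Conditional GAN objective: D sees the input x together with either
% the real target y or the generated output G(x, z)
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)]
                         + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]

% L1 reconstruction term against the ground truth
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

% Overall objective (lambda = 100 in the paper)
G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```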

Generator

  • U-Net-style encoder-decoder architecture with skip connections between mirrored encoder and decoder layers
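The skip-connection idea can be shown at the shape level with a toy numpy sketch (illustrative only, not the paper's exact layer stack): encoder features at each resolution are saved and concatenated channel-wise with decoder features at the same scale, letting low-level structure bypass the bottleneck.

```python
import numpy as np

def downsample(x):
    # Stand-in for a stride-2 conv: halve the spatial size, keep channels
    return x[:, ::2, ::2]

def upsample(x):
    # Stand-in for a transposed conv: double the spatial size
    return x.repeat(2, axis=1).repeat(2, axis=2)

def unet_forward(x, depth=3):
    skips = []
    for _ in range(depth):              # encoder path
        skips.append(x)                 # remember features at this scale
        x = downsample(x)
    for skip in reversed(skips):        # decoder path
        x = upsample(x)
        # Skip connection: channel-wise concat with the matching encoder scale
        x = np.concatenate([x, skip], axis=0)
    return x

x = np.zeros((3, 8, 8))                 # (channels, H, W)
y = unet_forward(x)
print(y.shape)                          # -> (12, 8, 8): channels grow at each concat
```

In a plain encoder-decoder the bottleneck must carry all information; the concatenations here show how the U-Net instead forwards each scale's features directly.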

Discriminator

  • The 70x70 PatchGAN Discriminator architecture is C64-C128-C256-C512

The paper ablates the Discriminator patch size: a smaller patch (16x16) created tiling artifacts, while 70x70 yielded results comparable to using the full 286x286 resolution.

Results

Pure L1 loss produces blurry results and guesses an average 'gray' color when uncertain; the cGAN loss encourages sharpness and more color; the last column combines both, with lambda=100 on the L1 component
Map to Aerial and Aerial to Map tasks: pretty realistic results at 512x512
Black-and-white to color (colorization) task: the cGAN component encourages color in the output
Color distributions of the outputs under the different loss formulations, measured in Lab color space
Road scene to semantic segmentation map task: poor results

Application: Drawings To Sketch

tweet link

Links: