InterClassGAN

Yingqing He¹ Zhiyi Zhang² Jiapeng Zhu¹ Yujun Shen² Qifeng Chen¹

¹HKUST ²ByteDance Inc.

[Paper] [Code]

Overview

This work aims to understand how a class-conditional GAN manages to unify the synthesis of various classes.

We take a close look at the widely used class-conditional batch normalization (CCBN) layer and observe that, when followed by the ReLU activation, CCBN helps distribute categorical information across feature channels. That is, a particular channel contributes differently to the synthesis of different categories.
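As a rough illustration of why CCBN followed by ReLU can route categorical information to channels, the sketch below applies a per-class scale and shift to batch-normalized features and then a ReLU. Everything in it is an assumption made for illustration: the function names `ccbn_relu` and `active_channels`, the toy batch, and the per-class `gamma`/`beta` values are hypothetical, and normalization is taken over the batch only (no spatial dimensions).

```python
def ccbn_relu(x, gamma, beta, eps=1e-5):
    """Class-conditional batch norm followed by ReLU (toy version).

    x:     batch of samples, each a list of per-channel values
    gamma: per-channel scale for the chosen class (hypothetical values)
    beta:  per-channel shift for the chosen class (hypothetical values)
    """
    n = len(x)
    num_channels = len(x[0])
    out = [[0.0] * num_channels for _ in range(n)]
    for c in range(num_channels):
        column = [row[c] for row in x]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i, v in enumerate(column):
            normalized = (v - mean) / (var + eps) ** 0.5
            # The class enters only through gamma[c] and beta[c].
            out[i][c] = max(0.0, gamma[c] * normalized + beta[c])
    return out


def active_channels(features):
    """Indices of channels that are non-zero for at least one sample."""
    num_channels = len(features[0])
    return [c for c in range(num_channels)
            if any(row[c] > 0 for row in features)]


# Toy batch: 3 samples, 3 channels (made-up numbers).
batch = [[1.0, 2.0, 3.0], [3.0, 2.5, 1.0], [2.0, 3.0, 2.0]]

# Hypothetical per-class CCBN parameters.
dog = {"gamma": [1.0, 1.0, 0.1], "beta": [0.5, 0.0, -2.0]}
bus = {"gamma": [0.1, 1.0, 1.0], "beta": [-2.0, 0.0, 0.5]}

dog_active = active_channels(ccbn_relu(batch, dog["gamma"], dog["beta"]))
bus_active = active_channels(ccbn_relu(batch, bus["gamma"], bus["beta"]))
print(dog_active, bus_active)
```

With these toy parameters, the "dog" class leaves channels 0 and 1 active while the "bus" class leaves channels 1 and 2 active: a strongly negative shift pushes a channel below zero, and ReLU then silences it for that class.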

We discover that (1) only a portion of channels are active in rendering images for a particular class, while the remaining channels barely affect the generation, (2) more similar categories tend to share more relevant channels (e.g., the channels involved in dog synthesis intersect with those of cats but are disjoint from those of buses), and (3) some channels respond strongly to the latent code rather than the class embedding and hence appear to deliver knowledge to all classes.
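One simple way to picture findings (1) and (2) is to threshold a per-class, per-channel relevance score and compare the resulting channel sets. The sketch below is only a toy under stated assumptions: the scores are invented, the threshold of 0.5 is arbitrary, and the helper names `relevant_channels` and `jaccard` are hypothetical. It does not reproduce how channel awareness is actually computed in the paper.

```python
def relevant_channels(scores, threshold):
    """Set of channel indices whose score exceeds the threshold."""
    return {c for c, s in enumerate(scores) if s > threshold}


def jaccard(a, b):
    """Jaccard overlap between two channel sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up relevance scores for 6 channels and 3 classes.
scores = {
    "dog": [0.9, 0.8, 0.7, 0.1, 0.0, 0.6],
    "cat": [0.8, 0.9, 0.1, 0.2, 0.0, 0.7],
    "bus": [0.0, 0.1, 0.0, 0.9, 0.8, 0.7],
}

channel_sets = {name: relevant_channels(s, threshold=0.5)
                for name, s in scores.items()}

dog_cat = jaccard(channel_sets["dog"], channel_sets["cat"])
dog_bus = jaccard(channel_sets["dog"], channel_sets["bus"])
shared_by_all = set.intersection(*channel_sets.values())
print(dog_cat, dog_bus, shared_by_all)
```

In this toy setup, dog and cat overlap far more than dog and bus, and one channel (index 5) is relevant to every class, loosely mirroring the class-agnostic channels described in (3).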

Our findings enable four novel applications of class-conditional GANs: (1) single-channel image editing, (2) category hybridization, (3) fine-grained semantic segmentation, and (4) category-wise synthesis performance evaluation.

Application 1: Single-channel Image Editing

Category-oriented Attributes

Category: Boston Bull

Mouth	Ear	Face	Tongue

Category: Brambling

Body Size	Head Pose	Belly	Feather

Category: Volcano

Mountain Size	Ash	Fire	Sky

Category: Castle

Width	Cloud	Foreground	Water

Latent-oriented Attributes

Categories (columns): Bubble, Great Grey Owl, Bee Eater, Speedboat, Lifeboat
Attributes (rows): Size, Background, Style

Application 2: Category Hybridization

Fuse the characteristics of two classes: mix features regarding the class-relevant channels.
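The mixing idea can be sketched as a channel-wise blend: keep the features of one class everywhere except on the channels relevant to the other class. This is a minimal sketch under assumptions, not the authors' procedure: the function name `hybridize`, the blend weight `alpha`, and the toy feature vectors are all hypothetical, and features are reduced to one value per channel.

```python
def hybridize(feat_a, feat_b, channels_b, alpha=1.0):
    """Blend class-B features into class-A features on selected channels.

    feat_a, feat_b: per-channel feature values for classes A and B
    channels_b:     set of channel indices treated as relevant to class B
    alpha:          blend weight (1.0 replaces A with B on those channels)
    """
    mixed = []
    for c, (a, b) in enumerate(zip(feat_a, feat_b)):
        if c in channels_b:
            mixed.append((1 - alpha) * a + alpha * b)
        else:
            mixed.append(a)  # untouched: this channel keeps class A's value
    return mixed


feat_a = [1.0, 2.0, 3.0, 4.0]      # toy features for class A
feat_b = [10.0, 20.0, 30.0, 40.0]  # toy features for class B
hybrid = hybridize(feat_a, feat_b, channels_b={1, 3}, alpha=0.5)
print(hybrid)
```

Here only channels 1 and 3 move halfway toward class B; varying `alpha` would control how strongly the second class shows through.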

Application 3: Fine-grained Semantic Segmentation

Segment synthesized samples: perform clustering on per-pixel features weighted by channel awareness.
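A small sketch may help show why weighting matters before clustering. It scales each channel of the per-pixel features by a weight and then runs a plain k-means. The choice of k-means, the farthest-point initialization, the helper names, and all numbers are assumptions for illustration; the text above does not name the clustering algorithm.

```python
def weight_pixels(pixels, awareness):
    """Scale every channel of each pixel feature by its awareness weight."""
    return [[w * v for w, v in zip(awareness, p)] for p in pixels]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Four toy pixels with 3 channels; channel 2 is a large nuisance signal.
pixels = [[1.0, 0.0, 5.0], [0.9, 0.1, -5.0],
          [0.0, 1.0, 5.0], [0.1, 0.9, -5.0]]
awareness = [1.0, 1.0, 0.0]  # hypothetical weights: channel 2 is ignored

weighted_labels = kmeans(weight_pixels(pixels, awareness), k=2)
unweighted_labels = kmeans(pixels, k=2)
print(weighted_labels, unweighted_labels)
```

In this toy case the weighted features group pixels 0 and 1 together and pixels 2 and 3 together, whereas the unweighted features are split by the nuisance channel instead.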

Application 4: Category-wise Synthesis Performance Evaluation

Categories with high total channel awareness: high quality and low diversity.

Manhole Cover	Rapeseed	Odometer	Website	Cypripedium parviflorum

Categories with low total channel awareness: low quality and high diversity.

Chainsaw	Stretcher	Reel	Plastic Bag	Barrow

BibTeX

  @article{he2022interpreting,
    title   = {Interpreting Class Conditional GANs with Channel Awareness},
    author  = {He, Yingqing and Zhang, Zhiyi and Zhu, Jiapeng and Shen, Yujun and Chen, Qifeng},
    journal = {arXiv preprint arXiv:2203.11173},
    year    = {2022}
  }

Related Work

Andrew Brock, Jeff Donahue, Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. ICLR, 2019.
Comment: Proposes a large-scale conditional GAN, i.e., BigGAN, trained on ImageNet.

Zongze Wu, Dani Lischinski, Eli Shechtman. StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation. CVPR, 2021.
Comment: Studies the per-channel effect of StyleGAN with the help of segmentation masks and achieves single-channel image editing.

Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler. DatasetGAN: Efficient labeled data factory with minimal human effort. CVPR, 2021.
Comment: Equips StyleGAN with an auxiliary branch for fine-grained semantic segmentation using a few annotated examples.