Interpreting Class Conditional GANs with Channel Awareness
Yingqing He1     Zhiyi Zhang2     Jiapeng Zhu1     Yujun Shen2     Qifeng Chen1
1HKUST      2ByteDance Inc.
Overview
This work aims to understand how a class-conditional GAN unifies the synthesis of various classes within a single generator.
  • We take a close look at the widely used class-conditional batch normalization (CCBN) layer and observe that, when followed by the ReLU activation, CCBN helps distribute the categorical information across feature channels. That is, a particular channel makes varying contributions to synthesizing different categories.
  • We discover that (1) only a portion of channels are active in rendering images for a particular class, while the remaining channels barely affect the generation, (2) more similar categories tend to share more relevant channels (e.g., the channels involved in dog synthesis intersect with those of cats but are disjoint from those of buses), and (3) some channels respond strongly to the latent code instead of the class embedding and hence appear to deliver knowledge to all classes.
  • Our findings enable four novel applications with class conditional GANs, including (1) single-channel image editing, (2) category hybridization, (3) fine-grained semantic segmentation, and (4) category-wise synthesis performance evaluation.
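The first two observations can be illustrated with a toy calculation. The sketch below is not the paper's definition of channel awareness. It assumes the normalized feature is standard normal, so a channel's output after CCBN and ReLU is ReLU(gamma * z + beta) with z ~ N(0, 1), and it uses the closed-form mean of that quantity as a proxy score. The (gamma, beta) values, the threshold, and the helper names are hypothetical and chosen only to show a "dog" and "cat" class sharing a channel while "bus" shares none.

```python
import math


def expected_relu_ccbn(gamma, beta):
    """Closed-form E[ReLU(gamma * z + beta)] for z ~ N(0, 1)."""
    scale = abs(gamma)  # the sign of gamma does not change the distribution
    if scale == 0.0:
        return max(beta, 0.0)
    t = beta / scale
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return beta * cdf + scale * pdf


def active_channels(ccbn_params, threshold=0.1):
    """Indices of channels whose proxy score exceeds the threshold."""
    return {c for c, (g, b) in enumerate(ccbn_params)
            if expected_relu_ccbn(g, b) > threshold}


def jaccard(a, b):
    """Overlap between two channel sets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical per-channel (gamma, beta) pairs; NOT taken from a trained model.
ON, OFF = (1.0, 0.5), (0.1, -2.0)
TOY_CCBN = {
    "dog": [ON, ON, OFF, OFF],
    "cat": [OFF, ON, ON, OFF],
    "bus": [OFF, OFF, OFF, ON],
}

active = {name: active_channels(p) for name, p in TOY_CCBN.items()}
dog_cat = jaccard(active["dog"], active["cat"])
dog_bus = jaccard(active["dog"], active["bus"])
```

A strongly negative beta with a small gamma pushes almost all of the channel's mass below zero, so ReLU silences it for that class. This is the gating effect the bullets describe.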
    Application 1: Single-channel Image Editing

    Category-oriented Attributes

    Category: Boston Bull
    Attributes: Mouth, Ear, Face, Tongue

    Category: Brambling
    Attributes: Body Size, Head Pose, Belly, Feather

    Category: Volcano
    Attributes: Mountain Size, Ash, Fire, Sky

    Category: Castle
    Attributes: Width, Cloud, Foreground, Water

    Latent-oriented Attributes

    Categories: Bubble, Great Grey Owl, Bee Eater, Speedboat, Lifeboat
    Attributes: Size, Background, Style
    Application 2: Category Hybridization

    Fuse the characteristics of two classes by mixing their features on the class-relevant channels.
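As a rough illustration of mixing on class-relevant channels, the sketch below blends the activations of a second class into those of a first class, but only on a chosen set of channels. The function name `hybridize`, the linear interpolation with weight `alpha`, and the toy feature values are assumptions made for this example; the page does not specify the exact mixing rule.

```python
def hybridize(feat_a, feat_b, channels_b, alpha=1.0):
    """Blend class-B features into class-A features on selected channels.

    feat_a, feat_b: lists of channels, each a flat list of activations.
    channels_b: set of channel indices treated as relevant to class B.
    alpha: 0 keeps A unchanged, 1 copies B on the selected channels.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature maps must have the same number of channels")
    mixed = []
    for c, (a, b) in enumerate(zip(feat_a, feat_b)):
        if c in channels_b:
            # Linear interpolation is an assumed mixing rule.
            mixed.append([(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)])
        else:
            mixed.append(list(a))  # untouched channels keep class-A content
    return mixed


# Toy features: 3 channels, 2 spatial positions each (hypothetical values).
feat_a = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
feat_b = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
hybrid = hybridize(feat_a, feat_b, channels_b={1}, alpha=0.5)
```

In a real generator the mixed features would be fed to the remaining layers. Here only channel 1 changes, which mirrors the idea that a small set of class-relevant channels carries the category's characteristics.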

    Application 3: Fine-grained Semantic Segmentation

    Segment synthesized samples: perform clustering on per-pixel features weighted by channel awareness.
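A minimal sketch of this idea follows, assuming k-means as the clustering method (the page does not name one) and a per-channel awareness vector that is already available. The helper names, the deterministic farthest-point initialization, and the toy pixels are all hypothetical. The toy data puts region structure in channels 0 and 1 and a large, class-irrelevant signal in channel 2.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def weight_pixels(pixels, awareness):
    """Scale every channel of every pixel feature by its awareness score."""
    return [[w * v for w, v in zip(awareness, px)] for px in pixels]


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for each point.
        labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Four toy pixels with three channels each (hypothetical values).
pixels = [[1.0, 0.0, 9.0], [0.9, 0.1, -9.0],
          [0.0, 1.0, 9.0], [0.1, 0.9, -9.0]]
awareness = [1.0, 1.0, 0.0]  # assumed scores: channel 2 is class-irrelevant

labels_weighted = kmeans(weight_pixels(pixels, awareness), k=2)
labels_raw = kmeans(pixels, k=2)
```

With the weighting, pixels 0 and 1 fall into one cluster and pixels 2 and 3 into the other. Without it, the irrelevant channel dominates the distances and the grouping follows that channel instead.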

    Application 4: Category-wise Synthesis Performance Evaluation

    Categories with high total channel awareness: high quality and low diversity.

    Manhole Cover, Rapeseed, Odometer, Website, Cypripedium parviflorum

    Categories with low total channel awareness: low quality and high diversity.

    Chainsaw, Stretcher, Reel, Plastic Bag, Barrow
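The ranking behind this comparison can be sketched as a sum over channels followed by a sort. The snippet assumes that per-channel awareness scores are already available for each category. The function name, the placeholder class names, and the scores are hypothetical and are not results from the paper.

```python
def rank_by_total_awareness(awareness):
    """Sort categories by the sum of their per-channel awareness, descending."""
    totals = {name: sum(scores) for name, scores in awareness.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical per-channel awareness scores for three placeholder classes.
TOY_AWARENESS = {
    "class_a": [0.9, 0.8, 0.7],
    "class_b": [0.2, 0.1, 0.0],
    "class_c": [0.5, 0.5, 0.1],
}
ranking = rank_by_total_awareness(TOY_AWARENESS)
ranked_names = [name for name, _ in ranking]
```

Following the observation above, categories near the top of such a ranking would be expected to show higher quality and lower diversity, and those near the bottom the opposite. The total is an indicator and does not replace a full evaluation metric.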
    BibTeX
      @article{he2022interpreting,
        title   = {Interpreting Class Conditional GANs with Channel Awareness},
        author  = {He, Yingqing and Zhang, Zhiyi and Zhu, Jiapeng and Shen, Yujun and Chen, Qifeng},
        journal = {arXiv preprint arXiv:2203.11173},
        year    = {2022}
      }
    
    Related Work
    Andrew Brock, Jeff Donahue, Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. ICLR, 2019.
    Comment: Proposes a large-scale conditional GAN, i.e., BigGAN, trained on ImageNet.
    Zongze Wu, Dani Lischinski, Eli Shechtman. StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation. CVPR, 2021.
    Comment: Studies the per-channel effect of StyleGAN with the help of segmentation masks and achieves single-channel image editing.