1. Introduction
  2. Methods
  3. Results
    1. Comparison between priors trained to invert features from different layers
    2. Does the learned prior trained on ImageNet generalize to other datasets?
    3. Does the learned prior generalize to visualizing different architectures?
    4. Does the learned prior generalize to visualizing hidden neurons?
    5. Do the synthesized images teach us what the neurons prefer or what the prior prefers?
    6. Other applications of our proposed method
  4. Discussion and Conclusion

(NIPS 2016) Synthesizing the preferred inputs for neurons in neural networks via deep generator networks
Paper: http://www.evolvingai.org/files/nguyen2016synthesizing.pdf
Code: https://github.com/Evolving-AI-Lab/synthesizing

Introduction

As neuroscientists do, one could simply show the network a large set of images and record those that highly activate a neuron [2]. However, that method has disadvantages compared with synthesizing preferred stimuli:

  1. it requires a distribution of images similar to those used to train the network, which may not be known (e.g. when probing a trained network without knowing which data were used to train it);

  2. even in such a dataset, many informative images that would activate the neuron may not exist because the image space is vast [3];

  3. with real images, it is unclear which of their features a neuron has learned: for example, if a neuron is activated by a picture of a lawn mower on grass, it is unclear whether it 'cares about' the grass; but if an image synthesized to highly activate the lawn mower neuron contains grass (as in Fig. 1), we can be more confident that the neuron has learned to pay attention to that context.

Synthesizing preferred stimuli is called activation maximization [4–8, 3, 9].

The set of all possible images is so vast that it is possible to produce 'fooling' images that excite a neuron but do not resemble the natural images the neuron has learned to detect.

Many hand-designed natural image priors have been experimentally shown to improve image quality such as: Gaussian blur [7], α-norm [5, 7, 8], total variation [6, 9], jitter [10, 6, 9], data-driven patch priors [8], center-bias regularization [9], and initializing from mean images [9].

These approaches, however, are typically limited to relatively low-dimensional images and narrowly focused datasets.

The image generator DNN that we use as a prior is trained to take in a code (e.g. a vector of scalars) and output a synthetic image that looks as close as possible to real images.

Our method restricts the search to only the set of images that can be drawn by the prior, which provides a strong bias toward realistic visualizations.

Because our algorithm uses a deep generator network to perform activation maximization, we call it DGN-AM.

Methods

Networks that we visualize. We demonstrate our visualization method on a variety of different networks.

Image generator network. We optimize the input code of an image generator network \(G\) so that \(G\) outputs an image that highly activates \(h\).

The training process involves four convolutional networks:

  1. a fixed encoder network E to be inverted

  2. a generator network G

  3. a fixed “comparator” network C

  4. a discriminator D
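These four networks combine into a training loss for G in the style of Dosovitskiy & Brox's generator-training setup that this prior builds on: G must reconstruct an image from its code E(x), matched in pixel space, in the comparator C's feature space, and well enough to fool the discriminator D. A minimal toy sketch with tiny linear stand-ins for all four networks; every name and weighting here is illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): the paper uses CaffeNet-scale convnets.
W_E = rng.normal(size=(8, 16))   # fixed encoder E: image (16-d) -> code (8-d)
W_G = rng.normal(size=(16, 8))   # generator G: code (8-d) -> image (16-d)
W_C = rng.normal(size=(4, 16))   # fixed comparator C: image -> deep features

def E(x):  return np.maximum(W_E @ x, 0.0)
def G(y):  return np.tanh(W_G @ y)
def C(x):  return np.maximum(W_C @ x, 0.0)

def D(x):  # toy discriminator: scalar realism score in (0, 1)
    return 1.0 / (1.0 + np.exp(-x.sum()))

def generator_loss(x, lam_feat=1.0, lam_img=1.0, lam_adv=0.01):
    """Loss for training G: reconstruct x from its code E(x), matched
    in pixel space, in C's feature space, and fooling D."""
    x_hat = G(E(x))
    l_img  = np.sum((x_hat - x) ** 2)        # pixel reconstruction
    l_feat = np.sum((C(x_hat) - C(x)) ** 2)  # feature-space reconstruction
    l_adv  = -np.log(D(x_hat) + 1e-12)       # adversarial (fool D) term
    return lam_feat * l_feat + lam_img * l_img + lam_adv * l_adv

x = rng.normal(size=16)
loss = generator_loss(x)
```

Only G is updated during this training; E, C, and D's role is to shape the loss (D is trained adversarially in the usual GAN alternation).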

Synthesizing the preferred images for a neuron. Intuitively, we search in the input code space of the image generator model \(G\) to find a code \(y\) such that \(G(y)\) is an image that produces high activation of the target neuron \(h\) in the DNN \(\Phi\) that we want to visualize.
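The search described above amounts to gradient ascent in the code space of G. Below is a minimal toy sketch, not the paper's implementation: the linear-tanh "generator" and the single linear "neuron" (`W_G`, `w_h`) stand in for the real CaffeNet-scale networks, and `dgn_am` is an illustrative name:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions) for G and for neuron h in the target DNN.
W_G = rng.normal(size=(16, 8))   # generator G: code y (8-d) -> image (16-d)
w_h = rng.normal(size=16)        # weights of target neuron h

def G(y):     return np.tanh(W_G @ y)
def act_h(y): return float(w_h @ G(y))   # activation Phi_h(G(y))

def dgn_am(y0, lr=0.05, steps=200, clip=3.0):
    """Gradient ascent on the code y to maximize act_h(y); codes are
    clipped to a box, echoing the paper's bounded-code regularization."""
    y = y0.copy()
    for _ in range(steps):
        pre = W_G @ y                              # pre-tanh image
        grad_img = w_h * (1.0 - np.tanh(pre) ** 2) # d act / d pre
        y += lr * (W_G.T @ grad_img)               # chain rule to the code
        y = np.clip(y, -clip, clip)
    return y

y0 = rng.normal(size=8) * 0.1
y_opt = dgn_am(y0)
```

The essential point is that the gradient flows through G all the way back to the code, so every candidate image `G(y)` stays on the generator's learned manifold of realistic images.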

The true goal of activation maximization is to generate interpretable preferred stimuli for each neuron.

Figure 2

Figure 2: To synthesize a preferred input for a target neuron h (e.g. the “candle” class output neuron), we optimize the hidden code input (red bar) of a deep image generator network (DGN) to produce an image that highly activates h. In the example shown, the DGN is a network trained to invert the feature representations of layer fc6 of CaffeNet. The target DNN being visualized can be a different network (with a different architecture and/or trained on different data). The gradient information (blue-dashed line) flows from the layer containing h in the target DNN (here, layer fc8) all the way through the image back to the input code layer of the DGN. Note that both the DGN and the target DNN being visualized have fixed parameters, and optimization only changes the DGN input code (red).

Results

Comparison between priors trained to invert features from different layers

Figure 1

Figure 1: Images synthesized from scratch to highly activate output neurons in the CaffeNet deep neural network, which has learned to classify different types of ImageNet images.

Does the learned prior trained on ImageNet generalize to other datasets?

Figure 3

Figure 3: Preferred stimuli for output units of an AlexNet DNN trained on the MIT Places 205 dataset, showing that the ImageNet-trained prior generalizes well to a dataset of scene images.

Does the learned prior generalize to visualizing different architectures?

Does the learned prior generalize to visualizing hidden neurons?

Do the synthesized images teach us what the neurons prefer or what the prior prefers?

Other applications of our proposed method

Figure S12

Figure S12: Visualizations obtained by optimizing an image to activate two neurons at the same time. Top panel: visualizations from activating single neurons. Bottom panel: visualizations from jointly activating the “candles” neuron and the corresponding neuron shown in the top panel. This method can thus serve as a novel way of generating art images, and can also uncover new types of preferred images for a neuron, shedding more light on what it does (here, ∼30 different images all activate the same “candles” neuron).
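The two-neuron optimization behind Fig. S12 only changes the objective: instead of ascending one activation, we ascend the sum of two. A toy illustration (not the paper's code), reusing the same kind of linear stand-in networks as before:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins (assumptions): linear-tanh generator, linear output layer
# with four "class" neurons in place of the real CaffeNet-scale networks.
W_G = rng.normal(size=(16, 8))    # generator: code y (8-d) -> image (16-d)
W_phi = rng.normal(size=(4, 16))  # target output layer

def G(y):    return np.tanh(W_G @ y)
def acts(y): return W_phi @ G(y)

def joint_objective(y, i, j):
    a = acts(y)
    return float(a[i] + a[j])

def maximize_pair(y0, i, j, lr=0.05, steps=300, clip=3.0):
    """Gradient ascent on the summed activations of neurons i and j."""
    y = y0.copy()
    w = W_phi[i] + W_phi[j]       # gradient of (act_i + act_j) wrt G's output
    for _ in range(steps):
        pre = W_G @ y
        y += lr * (W_G.T @ (w * (1.0 - np.tanh(pre) ** 2)))
        y = np.clip(y, -clip, clip)
    return y

y0 = rng.normal(size=8) * 0.1
y_opt = maximize_pair(y0, 0, 1)
```

The same idea extends to any weighted combination of neurons, which is what makes the blended "candles plus X" images in the bottom panel possible.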

Discussion and Conclusion