Rethinking Atrous Convolution for Semantic Image Segmentation
Paper: https://arxiv.org/abs/1706.05587

在这篇论文，作者在 deeplab 的基础上重新研究了 atrous convolution，并提出了 cascaded module 和 parallel module 两种结构。

中文

简介

对于语义分割，考虑使用DCNN的两个挑战：

For the task of semantic segmentation, we consider two challenges in applying Deep Convolutional Neural Networks (DCNNs).

this invariance to local image transformation may impede dense prediction tasks, where detailed spatial information is desired.

With atrous convolution, also known as dilated convolution, is able to control the resolution at which feature responses are computed within DCNNs without requiring learning extra parameters.
the existence of objects at multiple scales.

Several methods have been proposed to handle the problem and we mainly consider four categories:
1. Image Pyramid
  
  an image pyramid to extract features for each scale input where objects at different scales become prominent at different feature maps.
2. Encoder-Decoder
  
  exploits multi-scale features from the encoder part and recovers the spatial resolution from the decoder part.
3. Context module
  
  extra modules are cascaded on top of the original network for capturing long range information.
4. Spatial Pyramid Pooling
  
  spatial pyramid pooling probes an incoming feature map with filters or pooling operations at multiple rates and multiple effective field-of-views.