2D Spatial Pyramid Pooling Forward Layer

The forward two-dimensional (2D) spatial pyramid pooling layer with pyramid height $L \in \mathbb{N}$ is a form of non-linear downsampling of an input tensor $X$. The library supports four-dimensional input tensors $X \in \mathbb{R}^{n_1 \times n_2 \times n_3 \times n_4}$. For each level $l \in \{0, \ldots, L-1\}$, 2D spatial pyramid pooling partitions the input tensor data into $(2^l)^2$ subtensors (bins) along dimensions $k_1$ and $k_2$ and computes the result in each subtensor. The computation is done according to the selected pooling strategy: maximum, average, or stochastic. The spatial pyramid pooling layer therefore applies the pooling $L$ times with different kernel sizes, strides, and paddings.

Problem Statement

The library provides several spatial pyramid pooling layers:

  • Spatial pyramid maximum pooling

  • Spatial pyramid average pooling

  • Spatial pyramid stochastic pooling

The following description applies to each of these layers.

Let $X \in \mathbb{R}^{n_1 \times n_2 \times n_3 \times n_4}$ be the tensor of input data and $k_1$ and $k_2$ be the dimensions along which kernels are applied. Without loss of generality, $k_1$ and $k_2$ are the last dimensions of the tensor $X$. For each level $l \in \{0, \ldots, L-1\}$ and number of bins $b = 2^l$, the layer applies 2D pooling with parameters:

  • Kernel sizes

  • Strides $s_i = m_i$

  • Paddings
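The per-level parameters can be sketched in a few lines of Python. The expressions for kernel sizes and paddings are not reproduced in the text above, so this sketch assumes a common convention: the kernel size is $\lceil n_{k_i} / b \rceil$ and the padding is half of the overhang $m_i b - n_{k_i}$, rounded up. Only the stride rule $s_i = m_i$ comes from the description itself. The function name `level_params` is hypothetical and is not part of the library.

```python
def level_params(n, level):
    """Return (kernel, stride, padding) along one dimension of size n.

    ASSUMPTION: kernel = ceil(n / b) and padding = ceil((kernel * b - n) / 2).
    These two formulas are illustrative; only stride == kernel is taken
    from the layer description.
    """
    if n < 1 or level < 0:
        raise ValueError("n must be positive and level non-negative")
    b = 2 ** level                    # number of bins along this dimension
    kernel = -(-n // b)               # integer ceil(n / b)   (assumed)
    stride = kernel                   # s_i = m_i             (from the text)
    overhang = kernel * b - n         # how far b kernels extend past n
    padding = (overhang + 1) // 2     # integer ceil(overhang / 2) (assumed)
    return kernel, stride, padding


# Parameters for a dimension of size 13 and pyramid height L = 3.
for level in range(3):
    print(level, level_params(13, level))
```

Under these assumptions, a dimension that is divisible by $b$ needs no padding, and the kernel simply tiles the dimension into $b$ equal bins.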

In the layout flattened along the dimension $n'$, the layer result is represented as a two-dimensional tensor $Y \in \mathbb{R}^{n_1 \times n'}$, where $n' = n_2 \sum_{l=0}^{L-1} (2^l)^2 = n_2 (4^L - 1) / 3$.
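The width of the flattened result can be checked with a short calculation. The sketch below assumes that $n_2$ counts the feature maps that are pooled independently and that each level $l$ contributes $(2^l)^2$ values per feature map. The function name `flattened_width` is hypothetical.

```python
def flattened_width(n2, height):
    """Number of columns n' in the flattened result.

    ASSUMPTION: every one of the n2 feature maps contributes (2**l)**2
    pooled values at each level l = 0, ..., height - 1.
    """
    if n2 < 1 or height < 1:
        raise ValueError("n2 and height must be positive integers")
    return n2 * sum((2 ** level) ** 2 for level in range(height))


print(flattened_width(3, 2))  # 3 * (1 + 4) = 15
```

The sum is a geometric series, so the same value is given by `n2 * (4 ** height - 1) // 3`.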

The following figure illustrates the behavior of the spatial pyramid maximum pooling forward layer with pyramid height L = 2:
[Figure: Spatial Pyramid Pooling Forward Layer]
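For a concrete case with pyramid height $L = 2$, the following minimal sketch applies spatial pyramid maximum pooling to a single 2D feature map in plain Python. It is an illustration, not the library's implementation. It assumes that both dimensions are divisible by the number of bins, so no padding is needed, and that the output lists level 0 first with the bins of each level in row-major order. The function name `spp_max_2d` is hypothetical.

```python
def spp_max_2d(feature_map, height):
    """Spatial pyramid max pooling of one 2D feature map (a list of rows).

    ASSUMPTIONS: both dimensions are divisible by 2**(height - 1), so no
    padding is required; output order is level by level, bins row-major.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for level in range(height):
        b = 2 ** level                      # bins along each dimension
        if rows % b or cols % b:
            raise ValueError("dimensions must be divisible by the bin count")
        m1, m2 = rows // b, cols // b       # kernel sizes, equal to strides
        for i in range(b):
            for j in range(b):
                window = [feature_map[r][c]
                          for r in range(i * m1, (i + 1) * m1)
                          for c in range(j * m2, (j + 1) * m2)]
                pooled.append(max(window))  # maximum pooling strategy
    return pooled


x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(spp_max_2d(x, 2))  # [16, 6, 8, 14, 16]
```

Level 0 pools the whole map into one value, and level 1 pools four 2 x 2 bins, which gives 1 + 4 = 5 values for this feature map.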
