The First International Workshop on Computational Models of the Visual Cortex: Hierarchies, Layers, Sparsity, Saliency and Attention

Research Article

A Deconvolutional Competitive Algorithm for Building Sparse Hierarchical Representations

  • @INPROCEEDINGS{10.4108/eai.3-12-2015.2262428,
        author={Dylan Paiton and Sheng Lundquist and William Shainin and Xinhua Zhang and Peter Schultz and Garrett Kenyon},
        title={A Deconvolutional Competitive Algorithm for Building Sparse Hierarchical Representations},
        proceedings={The First International Workshop on Computational Models of the Visual Cortex: Hierarchies, Layers, Sparsity, Saliency and Attention},
        publisher={ACM},
        proceedings_a={CMVC},
        year={2016},
        month={5},
        keywords={sparse coding, visual cortex, hierarchical model, deep learning, convolutional neural network, deconvolution},
        doi={10.4108/eai.3-12-2015.2262428}
    }
    
Dylan Paiton (1,*), Sheng Lundquist (2), William Shainin (3), Xinhua Zhang (4), Peter Schultz (3), Garrett Kenyon (5)
  • 1: Vision Science Graduate Group, University of California, Berkeley
  • 2: Computer Science Department, Portland State University
  • 3: The New Mexico Consortium
  • 4: The University of New Mexico
  • 5: Los Alamos National Laboratory
*Contact email: dpaiton@berkeley.edu

Abstract

Sparse coding methods have been used to study how hierarchically organized representations in the visual cortex can be learned from unlabeled natural images. Here, we describe a novel Deconvolutional Competitive Algorithm (DCA), which explicitly learns non-redundant hierarchical representations by enabling competition both within and between sparse coding layers. All layers in a DCA are trained simultaneously, and all layers contribute to a single image reconstruction. Because the entire DCA hierarchy constitutes a single dictionary, there is no need for dimensionality reduction between layers, such as MAX pooling. We show that a 3-layer DCA trained on short video clips exhibits a clear segregation of image content, with features in the top layer reconstructing large-scale structures while features in the middle and bottom layers reconstruct progressively finer details. Compared to lower levels, the representations at higher levels are more invariant to the small image transformations between consecutive video frames recorded with hand-held cameras. The representations at all three hierarchical levels combine synergistically in a whole-image classification task. Consistent with psychophysical studies and electrophysiological experiments, broad, low-spatial-resolution image content is generated first, primarily from sparse representations in the highest layer, with fine spatial details filled in later from representations at lower hierarchical levels.
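
To make the joint-reconstruction idea concrete, the following is a minimal sketch, not the authors' implementation: the paper's DCA uses convolutional (deconvolutional) dictionaries learned from video with all layers trained simultaneously, whereas this toy uses two dense random dictionaries on a single patch. It shows two sparse coding layers contributing to one shared reconstruction and competing through the common residual; the names (soft_threshold, D_top, D_bottom), the ISTA-style update, and all sizes and threshold values are illustrative assumptions.

    # Minimal sketch (NOT the authors' DCA implementation): two sparse coding
    # layers jointly reconstruct one image patch and compete through a shared
    # residual. Dense random dictionaries and ISTA-style updates are used here
    # purely for illustration.
    import numpy as np

    rng = np.random.default_rng(0)

    def soft_threshold(x, lam):
        # Proximal operator for the L1 sparsity penalty.
        return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

    n_pix, n_top, n_bottom = 256, 32, 128               # illustrative sizes
    D_bottom = rng.standard_normal((n_pix, n_bottom))   # fine-detail dictionary
    D_top = rng.standard_normal((n_pix, n_top))         # broad-structure dictionary
    D_bottom /= np.linalg.norm(D_bottom, axis=0)        # unit-norm dictionary elements
    D_top /= np.linalg.norm(D_top, axis=0)

    image = rng.standard_normal(n_pix)                  # stand-in for an image patch
    a_bottom = np.zeros(n_bottom)                       # sparse code, bottom layer
    a_top = np.zeros(n_top)                             # sparse code, top layer
    lam, step = 0.1, 0.05

    for _ in range(200):
        # Single shared reconstruction: both layers project into image space.
        recon = D_bottom @ a_bottom + D_top @ a_top
        residual = image - recon
        # Updating both codes against the same residual makes the layers
        # compete to explain the same image content.
        a_bottom = soft_threshold(a_bottom + step * (D_bottom.T @ residual), step * lam)
        a_top = soft_threshold(a_top + step * (D_top.T @ residual), step * lam)

    print("reconstruction error:", np.linalg.norm(image - (D_bottom @ a_bottom + D_top @ a_top)))
    print("active elements (bottom, top):", np.count_nonzero(a_bottom), np.count_nonzero(a_top))

In the full DCA the analogous competition operates within and between all of the hierarchical layers at once, so top-layer elements are free to account for broad structure while lower layers explain the remaining fine detail.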