Keywords:
Computer science
Abstract:
Deep convolutional neural networks (CNNs) trained on visual objects have shown an intriguing ability to predict some response properties of visual cortical neurons. However, the factors (e.g., whether the model is trained, receptive field size, etc.) and computations (e.g., convolution, rectification, pooling, normalization, etc.) that give rise to this ability, the level at which it arises, and the role of intermediate processing stages in explaining changes that develop across areas of the cortical hierarchy are poorly understood. We focused on sensitivity to textures as a paradigmatic example, since recent neurophysiology experiments provide rich data pointing to texture sensitivity in secondary (but not primary) visual cortex (V2). We initially explored the CNN without any fitting to the neural data, and found that the first two layers of the CNN showed qualitative correspondence to the first two cortical areas in terms of texture sensitivity. Having shown a qualitative similarity between the CNN models and the brain V2 data, we sought to develop an approach to quantify this correspondence. We developed methods to select the same number of CNN model neurons that best fit the brain neural recordings. We found that the CNN could develop compatibility with secondary cortex in the second layer following rectification, and that this was improved following pooling, but only mildly influenced by the local normalization operation. Higher layers of the CNN could further, though modestly, improve the compatibility with the V2 data. The compatibility was reduced when incorporating random rather than learned weights. We also considered several other popular models in the literature, such as Hmax and ScatNet. Hmax is a model of visual cortex that builds an increasingly complex and invariant feature representation in the higher layers. ScatNet is a specialized model for texture discrimination that incorporates higher-order moments. We found that only the second layer in both the models has so