Paper Detail

Paper: PS-2A.34
Session: Poster Session 2A
Location: Symphony/Overture
Session Time: Friday, September 7, 17:15 - 19:15
Presentation Time: Friday, September 7, 17:15 - 19:15
Presentation: Poster
Publication: 2018 Conference on Cognitive Computational Neuroscience, 5-8 September 2018, Philadelphia, Pennsylvania
Paper Title: Beware of the beginnings: intermediate and higher-level representations in deep neural networks are strongly affected by weight initialization
Authors: Johannes Mehrer, University of Cambridge, United Kingdom; Nikolaus Kriegeskorte, Columbia University, United States; Tim C. Kietzmann, University of Cambridge, United Kingdom
Abstract: Deep neural networks (DNNs) excel at complex visual recognition tasks and have successfully been used as models of visual processing in the primate brain. Because network training is computationally expensive, many computational neuroscientists rely on pre-trained networks. Yet it is unclear how far the obtained results generalize, because different weight initializations might shape the learned features differently, even when the networks reach similar test performance. Here we estimate the effects of weight initialization while keeping the network architecture and training sequence identical. To investigate the learned representations, we use representational similarity analysis (RSA), a technique borrowed from neuroscience. RSA characterizes a network’s internal representations by estimating all pairwise distances across a large set of input conditions – an approach that is invariant to rotations of the underlying high-dimensional activation space. Our results indicate that differently initialized DNNs trained on the same task converged on indistinguishable performance levels but differed substantially in their intermediate and higher-level representations. This poses a potential problem for comparing representations across networks and neural data. As a path forward, we show that biologically motivated constraints, such as Gaussian noise and rate-limited tanh activation functions, can substantially improve the reliability of learned representations.
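
To make the RSA description in the abstract concrete, here is a minimal sketch of how a representational dissimilarity matrix (RDM) could be computed and compared across two sets of activations. It is illustrative only and is not the authors' pipeline: the Euclidean distance measure, the Pearson comparison of RDMs, the function names (`rdm`, `upper_triangle`, `rdm_similarity`), and the synthetic random "activations" are all assumptions chosen for the example. It also shows the invariance property mentioned above by applying a random orthogonal transform (a rotation, possibly combined with a reflection) to the activation space.

```python
import numpy as np


def rdm(activations):
    """Pairwise Euclidean distances between rows (conditions x units).

    Assumption: Euclidean distance; the abstract does not name a measure.
    """
    diff = activations[:, None, :] - activations[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix as a vector."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]


def rdm_similarity(rdm_a, rdm_b):
    """Pearson correlation between the upper triangles of two RDMs."""
    return np.corrcoef(upper_triangle(rdm_a), upper_triangle(rdm_b))[0, 1]


rng = np.random.default_rng(0)

# Synthetic stand-in for one layer's responses: 20 input conditions, 50 units.
acts = rng.normal(size=(20, 50))

# Random orthogonal matrix from a QR decomposition (rotation or reflection).
q, _ = np.linalg.qr(rng.normal(size=(50, 50)))
rotated = acts @ q

# Unrelated synthetic activations, standing in for a different representation.
other = rng.normal(size=(20, 50))

same = rdm_similarity(rdm(acts), rdm(rotated))
diff = rdm_similarity(rdm(acts), rdm(other))
print(f"RDM similarity after orthogonal transform: {same:.4f}")
print(f"RDM similarity to unrelated activations:  {diff:.4f}")
```

Because an orthogonal transform preserves Euclidean distances, the RDM of the transformed activations matches the original one up to floating-point error, whereas unrelated activations yield a different RDM. In a comparison of differently initialized networks, `acts` and `other` would be replaced by the responses of the same layer in two trained network instances to the same set of inputs.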