GSO-2011: David Balduzzi
Constructing a spiking currency: constrained utility optimization via wake-sleep STDP

Cortical neurons receive a mixture of local and global data: patterns of spiking activity on their approximately 10,000 synapses and diffuse signals from neuromodulatory systems which reflect salient outcomes experienced by the organism. Spikes form the dominant method of interneuronal communication, suggesting that understanding how neurons encode information into their spike-trains and how they use this information to collectively make decisions is an important problem. We investigate how individual neurons contribute to global brain functioning from an optimization perspective. In our model, neurons optimize local estimates of the expected utility after spiking, subject to an information-theoretic constraint related to sparsity. Utility estimates are computed using neuromodulatory signals, post-synaptic spikes and spiking feedback from downstream neurons. The constraint is imposed by adapting the infomax algorithm [1] into a method for homeostatically regulating spiking information transfer, following ideas developed in [2]. We first consider the simplest case, where utility is defined as the number of post-synaptic spikes. The constrained optimization problem can then be approximately implemented in leaky integrate-and-fire neurons via a wake-sleep spike-timing dependent plasticity (STDP) algorithm that separates learning into utility maximization (online, wake) and constraint satisfaction (offline, sleep) phases [3]. Experiments demonstrate that wake-sleep learning yields faster, more robust learning than classical STDP. Incorporating feedback spikes into utility estimates results in cooperative learning: neurons learn to elicit spikes from other neurons. If neurons spike when expected utility is maximized, then spikes align with utility. Spikes thus form a neuronal currency.
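The split into an online (wake) phase and an offline (sleep) phase can be illustrated with a toy sketch. This is not the algorithm of [3], whose details are not given here. It assumes a potentiation-only, pair-based STDP update as a stand-in for utility maximization (utility being the number of post-synaptic spikes) and a simple rescaling of the summed synaptic weight as a stand-in for the information-theoretic constraint. All function names, parameter values and the input pattern are hypothetical.

```python
import math


def make_inputs(n_steps):
    """Toy input: synapses 0 and 1 fire together every 10 steps;
    synapses 2 and 3 take turns firing alone, 5 steps later."""
    inputs = []
    for t in range(n_steps):
        if t % 10 == 0:
            inputs.append([0, 1])
        elif t % 10 == 5:
            inputs.append([2] if (t // 10) % 2 == 0 else [3])
        else:
            inputs.append([])
    return inputs


def wake_phase(weights, inputs, a_plus=0.01, tau_plus=20.0,
               tau_m=20.0, v_thresh=1.0):
    """Online phase (assumed form): run a leaky integrate-and-fire neuron
    and potentiate each synapse according to how recently it was active
    before every post-synaptic spike."""
    w = list(weights)
    last_pre = [None] * len(w)       # time of most recent input spike
    decay = math.exp(-1.0 / tau_m)   # membrane leak per time step
    v = 0.0
    post_spikes = []
    for t, active in enumerate(inputs):
        for i in active:
            last_pre[i] = t
        v = v * decay + sum(w[i] for i in active)
        if v >= v_thresh:
            post_spikes.append(t)
            v = 0.0                  # reset after spiking
            for i, t_pre in enumerate(last_pre):
                if t_pre is not None:
                    w[i] += a_plus * math.exp(-(t - t_pre) / tau_plus)
    return w, post_spikes


def sleep_phase(weights, budget):
    """Offline phase (assumed form): rescale the weights so that they
    sum to a fixed budget, preventing unbounded growth."""
    total = sum(weights)
    return [wi * budget / total for wi in weights]


def wake_sleep(weights, n_cycles=5, n_steps=100):
    """Alternate wake and sleep phases, holding the weight sum fixed."""
    budget = sum(weights)
    post_spikes = []
    for _ in range(n_cycles):
        weights, post_spikes = wake_phase(weights, make_inputs(n_steps))
        weights = sleep_phase(weights, budget)
    return weights, post_spikes


if __name__ == "__main__":
    final_w, spikes = wake_sleep([0.6, 0.6, 0.6, 0.6])
    print("final weights:", [round(x, 3) for x in final_w])
    print("post-synaptic spikes in last wake phase:", len(spikes))
```

In this toy setting the wake phase strengthens the two synapses that jointly drive the neuron to threshold, while the sleep phase keeps the total weight fixed, so the remaining synapses are weakened in relative terms.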
The information-theoretic constraint, imposed by homeostatic regulation of synaptic weights, prevents the currency from devaluing due to overspiking ("inflation"). Finally, incorporating neuromodulatory signals into utility estimates aligns price signals in the "spiking marketplace" with real outcomes experienced by the organism. Neuronal dynamics may thus be organized around discovering and exploiting utility sources in the "spiking marketplace" provided by the cortex.

References
[1] Bell AJ, Sejnowski TJ: An information-maximization approach to blind separation and blind deconvolution. Neural Comput 1995, 7(6):1129-59.
[2] Balduzzi D, Tononi G: Integrated Information in Discrete Dynamical Systems: Motivation and Theoretical Framework. PLoS Comput Biol 2008, 4(6):e1000091.
[3] Balduzzi D, Besserve M: Submitted.