Alex Cayco-Gajic: Learning Dynamics of Distributed Neural Networks

École Normale Supérieure, Paris

Understanding how distributed networks of neurons learn is a core question in both neuroscience and machine learning. While machine learning has offered significant insight into the learning dynamics of gradient descent in homogeneous networks, several challenges remain in applying this intuition to learning in messy, heterogeneous neural circuits. In this talk, I will present two perspectives on this question. First, unlike the homogeneous architectures typically used in machine learning, biological learning relies on coordination across interconnected brain areas -- each with its own roles, architectures, and learning rules. However, we currently lack theoretical frameworks for how learning might be distributed across the brain. We have recently proposed a modular, multi-area model that combines a recurrent controller network, which stores dynamic motor memories, with a feedforward adapter network that responds rapidly to environmental perturbations. In this architecture, the adapter learns an internal error signal that can be used to drive online correction of motor output, while simultaneously tutoring memory consolidation in the recurrent module. Based on structural features, we propose a new role for cerebello-cortical interactions during motor adaptation.
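
To make the division of labor concrete, here is a minimal numerical sketch of the two-module idea. It is an illustration only: the frozen controller activity, the scalar perturbation, and the specific fast/slow learning rules below are simplifying assumptions, not the model presented in the talk.

```python
import numpy as np

# Minimal sketch (an illustrative simplification, not the speaker's implementation):
# a recurrent controller stores a motor memory, a fast feedforward adapter cancels
# an environmental perturbation online, and its correction is slowly consolidated
# into the controller's readout.

rng = np.random.default_rng(0)
n_rec, n_ctx = 50, 10

W_rec = rng.normal(0, 1 / np.sqrt(n_rec), (n_rec, n_rec))  # recurrent weights (fixed here)
x = np.tanh(W_rec @ rng.normal(0, 1, n_rec))               # frozen controller activity = stored memory
ctx = rng.normal(0, 1, n_ctx)                              # contextual input to the adapter

target = 1.0
w_out = target * x / (x @ x)     # controller readout initially reproduces the target
w_adp = np.zeros(n_ctx)          # adapter starts naive
perturb = 0.8                    # perturbation applied to the motor output

fast_lr, slow_lr = 0.05, 0.005
for t in range(2000):
    correction = w_adp @ ctx                      # adapter's online correction
    y = w_out @ x + perturb + correction          # perturbed motor output
    err = target - y
    w_adp += fast_lr * err * ctx                  # fast, error-driven adapter plasticity
    w_out += slow_lr * correction * x / (x @ x)   # slow consolidation tutored by the adapter

print(f"residual error: {target - (w_out @ x + perturb + w_adp @ ctx):.4f}")
print(f"adapter correction after consolidation: {w_adp @ ctx:.4f}")  # decays toward 0
```

In this toy version, the adapter cancels the perturbation within a few trials, and the slow consolidation term then gradually transfers that correction into the controller's readout, so the adapter's contribution fades while the stored memory is updated.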

In the second part of the talk, I will turn to the role of multiple populations within a single region. Gradient-based algorithms are a cornerstone of artificial neural network training, yet it remains unclear whether biological neural networks use similar gradient-based strategies during learning. Experiments often reveal a diversity of synaptic plasticity rules, but whether these amount to an approximation of gradient descent is still an open question. Here, we investigate a previously overlooked possibility: that learning dynamics may include fundamentally non-gradient “curl”-like components while still effectively optimizing a loss function. Towards this end, we study the learning dynamics of feedforward networks while systematically introducing non-gradient dynamics by incorporating a second population of neurons characterized by sign-flipped plasticity rules. Surprisingly, our results identify specific architectures where curl terms can actually speed learning compared to gradient descent by helping to escape saddle points. Our results offer a unique counterpoint to normative theories of gradient-based learning in biological and artificial networks.
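
The claim that non-gradient dynamics can coexist with loss minimization has a simple intuition: adding an antisymmetric ("curl") component to gradient flow leaves the instantaneous rate of loss decrease unchanged, because the antisymmetric part contributes nothing to the quadratic form grad(L)^T A grad(L). The sketch below illustrates this on a toy quadratic loss; the explicit antisymmetric matrix stands in for the effect of a sign-flipped population and is a constructed example, not the analysis from the talk.

```python
import numpy as np

# Toy illustration (a constructed example, not the speaker's model): adding an
# antisymmetric "curl" term to gradient descent still drives the loss down,
# because in the continuous-time limit
#   dL/dt = -grad(L)^T (I + c*A) grad(L) = -||grad(L)||^2
# whenever A is antisymmetric (A^T = -A).

H = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # positive-definite Hessian of a quadratic loss
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])         # antisymmetric matrix supplying the curl

loss = lambda w: 0.5 * w @ H @ w
grad = lambda w: H @ w

def run(curl_strength, steps=300, lr=0.05):
    w = np.array([2.0, -1.5])
    for _ in range(steps):
        g = grad(w)
        w = w - lr * (g + curl_strength * A @ g)   # non-gradient update field
    return loss(w)

for c in [0.0, 1.0, 3.0]:
    print(f"curl strength {c:.1f}: final loss {run(c):.2e}")
```

All three runs converge: the curl term changes the trajectory through weight space (and, as the talk argues, can change the speed of learning in specific architectures), but not the fact of descent.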

 

Guests are welcome!

 

Organized by

Simone Ciceri & Henning Sprekeler

Location: BCCN Berlin, lecture hall 9, Philippstr. 13 Haus 6, 10115 Berlin
