How neural networks learn representation: a random matrix theory perspective

Topic:	How neural networks learn representation: a random matrix theory perspective
Date:	28/03/2023
Time:	3:00 pm - 4:00 pm
Venue:	Lady Shaw Building C3
Category:	Seminars
Speaker:	Mr. Denny Wu
Details:	Abstract: Random matrix theory (RMT) provides powerful tools to characterize the performance of random neural networks (at i.i.d. initialization) in high dimensions. However, it is unclear whether such tools can be applied to trained neural networks, where the parameters are no longer i.i.d. due to gradient-based learning. In this work, we use RMT to precisely quantify the benefit of feature (representation) learning in the “early phase” of gradient descent training. We consider a two-layer neural network in the proportional asymptotic limit and compute the asymptotic prediction risk of kernel ridge regression on the learned neural network representation. Our results demonstrate that feature learning can lead to a considerable advantage over the initial random features model (and possibly a wide range of fixed kernels), and highlight the role of learning rate scaling in the initial phase of training. Joint work with Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, and Greg Yang.
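The comparison described in the abstract can be tried numerically at small scale. The following is a minimal sketch under stated assumptions, not the speaker's actual setup. It assumes a hypothetical single-index teacher, a tanh activation, a single full-batch gradient step on the first layer, fresh samples for the ridge step, and arbitrary sizes. The names `one_step_experiment`, `ridge_test_error`, `eta`, and `lam` are illustrative only.

```python
import numpy as np


def ridge_test_error(feat_train, y_train, feat_test, y_test, lam):
    """Fit ridge regression on the given features and return the test MSE."""
    n_feat = feat_train.shape[1]
    gram = feat_train.T @ feat_train + lam * np.eye(n_feat)
    coef = np.linalg.solve(gram, feat_train.T @ y_train)
    return float(np.mean((feat_test @ coef - y_test) ** 2))


def one_step_experiment(n=400, d=200, width=300, eta=1.0, lam=1e-2, seed=0):
    """Compare ridge regression on random features vs. features after one
    gradient step on the first layer. All modelling choices are assumptions."""
    rng = np.random.default_rng(seed)

    # Assumed teacher: a single-index model with a unit-norm direction beta.
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)

    def sample(m):
        x = rng.standard_normal((m, d))
        y = np.tanh(x @ beta) + 0.1 * rng.standard_normal(m)
        return x, y

    x_gd, y_gd = sample(n)   # data used for the gradient step
    x_tr, y_tr = sample(n)   # fresh data for the ridge fit (an assumption)
    x_te, y_te = sample(n)   # held-out data for the test error

    # i.i.d. initialization of a two-layer network f(x) = a^T tanh(W^T x).
    w0 = rng.standard_normal((d, width)) / np.sqrt(d)
    a = rng.standard_normal(width) / np.sqrt(width)

    # One full-batch gradient step on W for the loss (1 / 2n) * sum(resid^2).
    pre = x_gd @ w0
    resid = np.tanh(pre) @ a - y_gd
    grad = x_gd.T @ ((resid[:, None] * a[None, :]) * (1.0 - np.tanh(pre) ** 2)) / n
    w1 = w0 - eta * grad

    # Ridge regression on the initial and on the updated representation.
    return {
        "random_features": ridge_test_error(
            np.tanh(x_tr @ w0), y_tr, np.tanh(x_te @ w0), y_te, lam),
        "one_step": ridge_test_error(
            np.tanh(x_tr @ w1), y_tr, np.tanh(x_te @ w1), y_te, lam),
    }


if __name__ == "__main__":
    # Vary eta to explore how the step size affects the learned features.
    for eta in (0.0, 1.0, 10.0):
        print(eta, one_step_experiment(eta=eta))
```

Setting `eta=0.0` leaves the weights at initialization, so both entries coincide. This gives a simple sanity check before trying larger step sizes. Whether and by how much the updated features help depends on the assumed teacher, sizes, and step size chosen here.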

The Chinese University of Hong Kong