Southeast University

Condensation sheds light on the mathematical foundation of deep neural networks

Posted by: Cao Lingling (曹玲玲)  Published: 2025-10-26

Speaker: Zhang Yaoyu (张耀宇), Associate Professor, Shanghai Jiao Tong University

Host: Geng Xin (耿新)

Time: Friday, October 31, 2025, 14:00-15:00

Venue: Lecture Hall 513, Computer Building, Jiulonghu Campus, Southeast University

Abstract: Condensation (also known as quantization, clustering, or alignment) is a widely observed phenomenon in which neurons in the same layer tend to align with one another during the nonlinear training of deep neural networks (DNNs). It is a key characteristic of the feature learning process of neural networks. In recent years, to advance the mathematical understanding of condensation, we have uncovered structures in the dynamical regime, loss landscape, and generalization of deep neural networks, from which a novel theoretical framework emerges. This presentation will cover these findings in detail. First, I will present results on identifying the dynamical regime of condensation in the infinite-width limit, where small initialization is crucial. Then, I will discuss the mechanism of condensation in the initial training stage and the global loss landscape structure underlying condensation in later training stages, highlighting the prevalence of condensed critical points and global minimizers. Finally, I will present results on the quantification of condensation and its generalization advantage, including a novel estimate of sample complexity in the best-possible scenario. These results underscore the effectiveness of the phenomenological approach to understanding DNNs, paving the way for the further development of deep learning theory.

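Speaker bio: Zhang Yaoyu is a tenure-track Associate Professor at the Institute of Natural Sciences and the School of Mathematical Sciences, Shanghai Jiao Tong University. He received a bachelor's degree in physics from Zhiyuan College, Shanghai Jiao Tong University, in 2012 and a Ph.D. in mathematics from Shanghai Jiao Tong University in 2016. From 2016 to 2020, he was a postdoctoral researcher at New York University Abu Dhabi & the Courant Institute, and at the Institute for Advanced Study in Princeton. His research focuses on the theoretical foundations of deep learning.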
