| Topic: | Towards a Statistical Theory of Contrastive Learning in Modern AI |
| Date: | 15/01/2026 |
| Time: | 10:00 am - 11:00 am |
| Venue: | Zoom Meeting (please refer to seminar PDF) |
| Category: | Seminars |
| Speaker: | Mr LIN Licong |
| PDF: | Mr.-LIN-Licong-_15-JAN-2026.pdf |
| Details: | Abstract: Contrastive learning has emerged as a central paradigm for representation learning in modern AI, with examples ranging from single-modal vision encoders like SimCLR to multimodal systems like CLIP that combine vision and language. Despite its widespread empirical success, a theoretical understanding of why contrastive learning produces transferable representations remains limited. In this talk, I will introduce the concept of approximate sufficient statistics, a generalization of the classical notion of sufficient statistics, and show that near-minimizers of the contrastive loss yield representations that are approximately sufficient, making them adaptable to diverse downstream tasks. I will first describe results for single-modal, augmentation-based contrastive learning, showing that contrastively learned encoders can be adapted to downstream tasks, with performance depending on their sufficiency and the augmentation-induced error. I will then extend the framework to multimodal settings and discuss implications for downstream tasks such as zero-shot classification, conditional diffusion models, and vision-language models. Together, these results provide a unified statistical perspective on why contrastively learned representations can be effectively adapted across tasks and modalities. |
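
For context, one standard instantiation of the contrastive loss referenced in the abstract is the InfoNCE (NT-Xent) objective used by SimCLR; the notation below (encoder $f$, temperature $\tau$) is illustrative and not taken from the speaker's results:

$$
\mathcal{L}(f) \;=\; -\,\mathbb{E}\!\left[\log \frac{\exp\!\big(\mathrm{sim}(f(x), f(x^{+}))/\tau\big)}{\sum_{i=1}^{N} \exp\!\big(\mathrm{sim}(f(x), f(x_{i}))/\tau\big)}\right],
$$

where $(x, x^{+})$ is a positive pair produced by data augmentation, $\{x_{i}\}_{i=1}^{N}$ consists of the positive together with $N-1$ negatives, $\mathrm{sim}(\cdot,\cdot)$ denotes cosine similarity, and $\tau > 0$ is a temperature parameter. The talk concerns representations $f$ that nearly minimize such a loss.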