Upcoming Events
Topic: Towards modern datasets: laying mathematical foundations to streamline machine learning
Date: 10/12/2024
Time: 10:30 am - 11:30 am
Venue: Lady Shaw Building LT2
Category: Seminars
Speaker: Mr. Chen CHENG
PDF: MR-Chen-CHENG_10-Dec-2024.pdf
Details:

Abstract

Datasets are central to the development of statistical learning theory and to the evolution of models. The burgeoning success of modern machine learning in sophisticated tasks relies crucially on the rapid growth of massive datasets such as ImageNet, SuperGLUE, and LAION-5B. However, this evolution breaks standard statistical learning assumptions and tools. In this talk, I will present two stories that tackle the challenges modern datasets present and that leverage statistical theory to shed light on how we should streamline modern machine learning.
In the first part, we study multilabeling—a curious aspect of modern human-labeled datasets that is often missing in statistical machine learning literature. We develop a stylized theoretical model to capture uncertainties in the labeling process, allowing us to understand the contrasts, limitations and possible improvements of using aggregated or non-aggregated data in a statistical learning pipeline.
In the second part, I will present novel theoretical tools that are not readily available from the classical literature, such as random matrix theory in the proportional regime. Theoretical tools for the proportional regime are crucially helpful in understanding “benign overfitting” and “memorization”. However, this is not always the most natural setting in statistics, where columns correspond to covariates and rows to samples. With the objective of moving beyond proportional asymptotics, we revisit ridge regression (ℓ₂-penalized least squares) on i.i.d. data X ∈ ℝ^{n×d}, y ∈ ℝ^n. We allow the feature vector to be infinite-dimensional (d = ∞), in which case it belongs to a separable Hilbert space.
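
For readers unfamiliar with the estimator named above, the following is a minimal finite-dimensional sketch of ridge regression in Python with NumPy. It is an illustration only, not the speaker's method: the function name `ridge_fit`, the 1/n normalization of the loss, the penalty value, and the synthetic Gaussian data are all assumptions made for this example, and the infinite-dimensional case discussed in the talk is not covered.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Hypothetical helper: minimize ||y - X b||^2 / n + lam * ||b||^2 over b.

    The 1/n scaling is an assumed convention; other references scale differently.
    """
    n, d = X.shape
    # Setting the gradient to zero gives (X^T X / n + lam * I) b = X^T y / n.
    gram = X.T @ X / n + lam * np.eye(d)
    return np.linalg.solve(gram, X.T @ y / n)


# Synthetic i.i.d. data (assumed for illustration): n samples as rows, d covariates as columns.
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.standard_normal((n, d))
beta_true = rng.standard_normal(d) / np.sqrt(d)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = ridge_fit(X, y, lam=0.1)
```

Solving the linear system directly, rather than forming an explicit matrix inverse, is the usual numerically safer choice; a larger `lam` shrinks the estimate further toward zero.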
