| Topic: | Towards Understanding Multi-Step Reasoning in Transformers: Case Studies in In-Context Learning |
| Date: | 10/02/2026 |
| Time: | 2:30 pm - 3:30 pm |
| Venue: | NAH 213 |
| Category: | Latest Seminars and Events |
| Speaker: | Professor Yuan Cao |
| PDF: | PROF-Yuan-Cao_10-FEB-2026.pdf |
| Details: | Abstract: Transformers have demonstrated impressive capabilities in solving complex reasoning tasks. One prevalent approach to enhancing the reasoning performance of language models is chain-of-thought (CoT), where the model generates explicit intermediate reasoning steps before arriving at a final answer—an example of "horizontal" reasoning performed sequentially within a single inference. Complementing this perspective, recent research has also investigated "vertical" reasoning through "implicit CoT", which examines how reasoning unfolds across the layers of a deep model. Together, these approaches offer new insights into the reasoning mechanisms of transformers. However, most existing studies of reasoning focus on natural language tasks, which can be less formal and more qualitative in nature. In this talk, we explore transformer reasoning in the more mathematically concrete setting of in-context learning, where the input to the model forms a training dataset for a specific learning task, and the model is required to output a predictor fitted from this data. Focusing on in-context linear regression and linear logistic regression, we present both experimental and theoretical results investigating when and how tasks performed by explicit CoT can be replicated by implicit CoT. We also identify an interesting setting in which a model guided by a teacher reasoning path may instead learn a different, but more efficient, reasoning path. These results provide rigorous case studies that illuminate the capabilities and limitations of multi-step reasoning in transformer models. |
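For readers unfamiliar with the in-context learning setting described in the abstract, the following minimal sketch (not from the talk; all names and the noiseless data model are illustrative assumptions) shows what an in-context linear regression task looks like: the "prompt" is a set of (x_i, y_i) pairs drawn from a hidden weight vector, and the ideal in-context predictor is the least-squares fit to those pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical in-context linear regression task: the prompt consists of
# n example pairs (x_i, y_i) generated from a hidden weight vector w,
# and the model must predict y for a new query input x_query.
d, n = 4, 32                  # feature dimension, number of in-context examples
w = rng.normal(size=d)        # task-specific hidden weights
X = rng.normal(size=(n, d))   # in-context inputs
y = X @ w                     # in-context labels (noiseless, for simplicity)

# The ideal predictor the model should implement in context:
# ordinary least squares fitted to the prompt examples.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

x_query = rng.normal(size=d)
y_pred = x_query @ w_hat      # in-context prediction for the query

# With noiseless, full-rank data, least squares recovers w exactly.
print(np.allclose(w_hat, w))  # → True
```

A "horizontal" (explicit CoT) solution would emit intermediate steps of such a fit, e.g. successive gradient-descent iterates, as output tokens; a "vertical" (implicit CoT) solution would carry those steps across the transformer's layers instead.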