Computer Science Seminar: Sang Michael Xie

March 25, 2024

12 - 1:15pm EDT

Room B-17, Hackerman Hall

Homewood Campus

3400 North Charles Street
Baltimore, Maryland 21218

This event is free

Who can attend?

Faculty
Staff
Students

Contact

Toni DeTallo

tdetall1@jhu.edu

410-516-8775

Description

Sang Michael Xie, a computer science doctoral student studying machine learning at Stanford University, will give a talk titled "Data-Distribution-Centric Machine Learning for Generalizable Language Models" for the Department of Computer Science.

Abstract:

High-quality datasets are crucial for improving the capabilities and training efficiency of large language models. However, current datasets are typically prepared in an ad hoc, heuristic way. In this talk, Sang Michael Xie will present principled approaches to improving and understanding language models centered on the pre-training data distribution. First, he will describe how to improve the efficiency of training multipurpose language models by optimizing the mixture of data sources with robust optimization. Second, he will discuss an efficient importance resampling method for selecting relevant data from trillion-token-scale web datasets for training a specialized model. Finally, he will introduce a first theoretical analysis of in-context learning, a key capability of language models to learn from examples in a textual prompt, that traces the capability back to modeling coherence structure in the pre-training data.