CLSP Seminar Series: Auditing Memorization, Dissecting Mechanisms, and Evaluating Behavior of Large Language Models

Sept 26, 2025

12:00 - 1:15pm EDT

Room B17, Hackerman Hall

Homewood Campus

3400 North Charles Street
Baltimore, Maryland 21218

Registration is required

This event is free

Who can attend?

Faculty
Staff
Students

Contact

Center for Language and Speech Processing

Description

Robin Jia, an assistant professor of computer science at the University of Southern California, will give a talk titled "Auditing Memorization, Dissecting Mechanisms, and Evaluating Behavior of Large Language Models" for the Center for Language and Speech Processing.

Abstract:

The widespread adoption of large language models (LLMs) places a responsibility on the AI research community to rigorously study and understand them. In this talk, I will describe my group's research on analyzing LLMs' memorization of pre-training data, their internal mechanisms, and their downstream behavior. First, I will introduce the Hubble project, in which we have pre-trained LLMs (up to 8B parameters) on controlled pre-training corpora to understand when and how they memorize sensitive data related to copyright risks, privacy leakage, and test set contamination; we envision these models as a valuable open-source resource for scientific inquiry into LLM memorization. Next, I will describe my group's work on understanding how language models work internally, including vignettes about how they perform arithmetic with Fourier features and how they can learn optimization subroutines for in-context learning. Finally, I will highlight a recent collaboration with USC oncologists in which we uncover LLM sycophancy issues that arise when patients ask these models for medical advice.

Registration

Registration is required

Please register in advance
