CS/CLSP Spring 2025 Seminar Series: Supervising Models that Are Smarter than Us
Description
Shi Feng, an assistant professor of computer science at George Washington University, will give a talk titled "Supervising Models that Are Smarter than Us" for the Department of Computer Science and the Center for Language and Speech Processing.
Abstract:
Advanced AI systems are being deployed for increasingly complex tasks. To ensure reliable human oversight of these systems, we need supervision protocols that remain effective as task complexity and model capabilities grow. Many approaches to this challenge involve assisting human supervisors with a second model, which can compensate for the human's weaknesses. However, this can also introduce new vulnerabilities. In this talk, I will discuss new research on both methods and threat models for assisted supervision protocols. I'll also share my thoughts on the meta-question of how we can make progress in scalable oversight, as well as how it overlaps with other AI safety research agendas.
Who can attend?
- Faculty
- Staff
- Students