CS Distinguished Speaker Seminar Series: Agreement and Alignment for Human-AI Collaboration
Description
Aaron Roth, a professor of computer and cognitive science in the Department of Computer and Information Science at the University of Pennsylvania, will give a talk titled "Agreement and Alignment for Human-AI Collaboration" for the Department of Computer Science.
This is a hybrid event; to attend virtually, use the Zoom link.
Abstract
As AI models become increasingly powerful, it is an attractive proposition to use them in important decision-making pipelines in collaboration with human decision-makers. But how should a human being and a machine learning model collaborate to reach decisions that are better than either could achieve alone? If the human and the AI model were perfect Bayesians operating in a setting with a commonly known and correctly specified prior, Aumann's classical agreement theorem would give one answer: they could engage in conversation about the task at hand, and their conversation would be guaranteed to converge to (accuracy-improving) agreement. This classical result, however, requires many implausible assumptions about the knowledge and computational power of both parties. We show how to recover similar (and more general) results using only computationally and statistically tractable assumptions that substantially relax full Bayesian rationality.
In the second part of the talk, we consider a more difficult problem: the AI model might be acting, at least in part, to advance the interests of its designer rather than the interests of its user, and those interests may be in tension. We show how market competition between different AI providers can mitigate this problem under only a mild "market alignment" assumption (that the user's utility function lies in the convex hull of the AI providers' utility functions), even when no single provider is well aligned. In particular, we show that in every Nash equilibrium of the AI providers under this market alignment condition, the user can advance her own goals as well as she could have in collaboration with a perfectly aligned AI model.
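As a rough sketch of the "market alignment" condition described in the abstract (the notation u_user, u_1, ..., u_k, and the weights λ_i are introduced here only for exposition and are not taken from the papers), the condition asks that the user's utility function be expressible as a convex combination of the k providers' utility functions:
\[
  u_{\mathrm{user}} \;=\; \sum_{i=1}^{k} \lambda_i\, u_i, \qquad \lambda_i \ge 0, \qquad \sum_{i=1}^{k} \lambda_i = 1,
\]
even though no individual u_i need be close to u_user, i.e., no single provider need be well aligned with the user on its own.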
This talk describes results from three papers: Tractable Agreement Protocols (2025 ACM Symposium on Theory of Computing), Collaborative Prediction: Tractable Information Aggregation via Agreement (ACM-SIAM Symposium on Discrete Algorithms), and Emergent Alignment from Competition. These are joint works with Natalie Collina, Ira Globus-Harris, Surbhi Goel, Varun Gupta, Emily Ryu, and Mirah Shi.
Who can attend?
- Faculty
- Staff
- Students