CS/CLSP Seminar: Xinyu "Crystina" Zhang
Description
Xinyu "Crystina" Zhang, a doctoral candidate at the University of Waterloo, will give a talk titled "Information Seeking Beyond English" as a joint seminar for the Department of Computer Science and the Center for Language and Speech Processing.
Abstract:
Pre-trained language models have brought revolutionary progress to information seeking in English. While the advance is exciting, transferring such progress to non-English languages, especially lower-resource ones, presents new challenges that require new resources and methodologies. In this talk, Xinyu "Crystina" Zhang will present her research on building effective information-seeking systems for non-English speakers. She will begin by introducing the benchmarks and datasets developed to support the evaluation and training of multilingual search systems. These resources have since been widely adopted within the community and enable the development of effective multilingual embedding models. The next part of her talk will share the best training practices found in developing such models, including strategies for enhancing backbone models and surprising transfer effects across languages. Building on these foundations, Zhang's work has expanded to examine how language models process multilingual text and facilitate knowledge transfer across languages. Her talk will conclude with a vision for the future of multilingual language model development, with the goal of adapting these models to unseen languages with minimal data and resource requirements, thereby bridging the gap for underrepresented linguistic communities.
Who can attend?
- Faculty
- Staff
- Students