ECE Department Seminar: Efficient Finetuning of Large Language Models via Large-Width Analysis
Description
Soufiane Hayou, an assistant professor in the Department of Applied Mathematics and Statistics at Johns Hopkins University and a member of the university's Data Science and AI Institute, will give a talk titled "Efficient Finetuning of Large Language Models via Large-Width Analysis" for the Department of Electrical and Computer Engineering.
Abstract:
Finetuning Large Language Models (LLMs) improves their performance on downstream tasks, which is desirable when the model is deployed for a specific task. Parameter-efficient finetuning methods such as LoRA (Low-Rank Adaptation) are popular because they allow large models to be finetuned at relatively low cost. When using LoRA, two hyperparameters critically shape learning: the learning rate and the initialization. In this talk, I will present two results. First, we prove and demonstrate that the two "zero-product" initializations (A random with B = 0, versus B random with A = 0) are not equivalent: initializing B to zero and A randomly permits larger stable learning rates and yields better performance; an infinite-width stability analysis explains the gap, and LLM experiments confirm it. Second, LoRA+ shows that using the same learning rate for the A and B matrices is suboptimal at large width; a simple scheme that assigns different learning rates to A and B yields more efficient feature learning, delivering consistent accuracy gains and up to roughly 2× faster convergence at the same compute. Finally, I will distill these insights into practical defaults.
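For readers unfamiliar with the setup, the following minimal sketch (in PyTorch, not taken from the speaker's work) illustrates the two choices the abstract refers to: a LoRA adapter with B initialized to zero and A initialized randomly, and an optimizer that assigns B a larger learning rate than A in the spirit of LoRA+. The layer sizes, rank, and the learning-rate ratio of 16 are arbitrary illustrative values, not recommendations from the talk.

```python
# Illustrative sketch of a LoRA adapter with the B = 0, A random initialization
# and separate learning rates for A and B (LoRA+-style). Values are placeholders.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)                     # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, d_in) / d_in ** 0.5)  # A: random initialization
        self.B = nn.Parameter(torch.zeros(d_out, r))               # B: zero initialization
        self.scale = alpha / r

    def forward(self, x):
        # y = W x + (alpha / r) * B A x; the product B A is zero at initialization,
        # so the adapter starts as an exact no-op on top of the frozen base layer.
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(d_in=1024, d_out=1024, r=8)

# Asymmetric learning rates: B gets a larger rate than A (ratio chosen for illustration).
base_lr = 1e-4
optimizer = torch.optim.AdamW([
    {"params": [layer.A], "lr": base_lr},
    {"params": [layer.B], "lr": 16 * base_lr},
])
```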
Who can attend?
- General public
- Faculty
- Staff
- Students