CS Seminar: Dan Fu

March 7, 2024
10:45 - 11:45am EST
This event is free

Who can attend?

  • Faculty
  • Staff
  • Students

Contact

Toni DeTallo
410-516-8775

Description

Dan Fu, a computer science doctoral student at Stanford University, will give a talk titled "Hardware-Aware Efficient Primitives for Machine Learning" for the Department of Computer Science.

Abstract

Efficiency is increasingly tied to quality in machine learning, with more efficient training algorithms leading to more powerful models. However, today's most popular machine learning models are built on asymptotically inefficient primitives. For example, attention in transformers scales quadratically with input size, while multilayer perceptrons scale quadratically with model dimension. In this talk, Dan Fu discusses his work on improving the efficiency of core primitives in machine learning, with an emphasis on hardware-aware algorithms and long-context applications. First, he focuses on replacing attention with gated state space models (SSMs) and convolutions, which scale sub-quadratically in context length. He describes H3 (Hungry Hungry Hippos), a gated SSM architecture that matches transformers in quality up to 3B parameters and achieves 2.4x faster inference. Second, he turns to hardware-aware algorithms for SSMs and convolutions, describing FlashFFTConv, a fast algorithm for computing SSMs and convolutions on GPU by optimizing the fast Fourier transform (FFT). FlashFFTConv yields up to 7x speedup and 5x memory savings, even over vendor solutions from NVIDIA. Third, he briefly touches on how these same techniques can yield sub-quadratic scaling in the model dimension, describing Monarch Mixer, which uses a generalization of the FFT to achieve sub-quadratic scaling in both sequence length and model dimension. Throughout the talk, he gives examples of how these ideas are beginning to take hold, with gated SSMs and their variants now achieving state-of-the-art performance in long-context language models, embedding models, and DNA foundation models.
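
For context on why the FFT matters here: a length-N convolution computed directly costs O(N^2) operations, while computing it through the FFT costs O(N log N). Below is a minimal NumPy sketch of that general FFT-convolution idea, which underlies long-convolution methods like FlashFFTConv; the function names and shapes are illustrative assumptions, not the talk's actual implementation.

    # Minimal sketch of FFT-based convolution (illustrative only; not
    # the talk's actual implementation). Names and shapes are assumptions.
    import numpy as np

    def direct_conv(u, k):
        """O(N^2) causal convolution of signal u with filter k."""
        N = len(u)
        y = np.zeros(N)
        for i in range(N):
            for j in range(i + 1):
                y[i] += u[j] * k[i - j]
        return y

    def fft_conv(u, k):
        """O(N log N) equivalent: zero-pad to 2N so circular
        convolution in the frequency domain matches linear convolution."""
        N = len(u)
        U = np.fft.rfft(u, n=2 * N)
        K = np.fft.rfft(k, n=2 * N)
        return np.fft.irfft(U * K, n=2 * N)[:N]

    rng = np.random.default_rng(0)
    u = rng.standard_normal(1024)  # input sequence
    k = rng.standard_normal(1024)  # long convolution filter (e.g., from an SSM)
    assert np.allclose(direct_conv(u, k), fft_conv(u, k))

This asymptotic gap is what makes FFT-based convolutions attractive as an attention replacement at long context lengths; per the abstract, FlashFFTConv's contribution is making this FFT computation fast on GPU hardware.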
