IDIES Seminar | Causal Foundation Models: Disentangling Physics from Systematics
Description
Daniel Muthukrishna, an astrophysicist and machine learning research scientist at MIT and an AstroAI fellow at Harvard's Center for Astrophysics, will give a talk titled "Causal Foundation Models: Disentangling Physics from Systematics" for the Institute for Data-Intensive Engineering and Science (IDIES) at the Bloomberg Center for Physics and Astronomy.
Abstract:
Foundation models for scientific data must contend with a fundamental challenge: observations often conflate the true underlying physical phenomena with systematic distortions introduced by measurement instruments. This entanglement limits model generalization, especially in heterogeneous or multi-instrument settings.
In this talk, I present a causally motivated foundation model that explicitly disentangles physical and instrumental factors using a dual-encoder architecture trained with structured contrastive learning or a generative flow-matching model. Leveraging naturally occurring observational triplets (i.e., where the same target is measured under varying conditions, and distinct targets are measured under shared conditions), the model learns separate latent representations for the underlying physical signal and instrument effects. Evaluated on simulated astronomical time series designed to resemble the complexity of variable stars observed by missions like NASA's Transiting Exoplanet Survey Satellite (TESS), the method outperforms traditional single-latent-space foundation models on downstream prediction tasks, particularly in low-data regimes. These results demonstrate that our model supports key capabilities of foundation models, including few-shot generalization and efficient adaptation, and highlight the importance of encoding causal structure into representation learning for structured data.
Who can attend?
- Faculty
- Staff
- Students