CMS Winter Meeting 2025

Toronto, December 5–8, 2025


Mathematics of Machine Learning
Org: Ben Adcock (Simon Fraser University), Ricardo Baptista (University of Toronto) and Giang Tran (University of Waterloo)

ISAAC GIBBS, University of California, Berkeley

AVI GUPTA, Simon Fraser University

MOHAMED HIBAT-ALLAH, University of Waterloo

SPENCER HILL, Queen’s University

ANASTASIS KRATSIOS, McMaster University

SOPHIE MORIN, Polytechnique Montréal

RACHEL MORRIS, Concordia University

CAMERON MUSCO, University of Massachusetts Amherst

ESHA SAHA, University of Alberta

MATTHEW THORPE, University of Warwick

ALEX TOWNSEND, Cornell University

YUNAN YANG, Cornell University
Training Distribution Optimization in the Space of Probability Measures

A central question in data-driven modeling is: from which probability distribution should training samples be drawn to most effectively approximate a target function or operator? This work addresses this question in the setting where “effectiveness” is measured by out-of-distribution (OOD) generalization accuracy across a family of downstream tasks. We formulate the problem as minimizing the expected OOD generalization error, or an upper bound thereof, over the space of probability measures. The optimal sampling distribution depends jointly on the model class (e.g., kernel regressors, neural networks), the evaluation metric, and the target map itself. Building on this characterization, we propose two adaptive, target-dependent data selection algorithms based on bilevel and alternating optimization. The resulting surrogate models exhibit significantly improved robustness to distributional shifts and consistently outperform models trained with conventional, non-adaptive, or target-independent sampling across benchmark problems in function approximation, operator learning, and inverse modeling.
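The alternating-optimization idea in the abstract can be illustrated with a minimal sketch: alternate between (i) fitting a surrogate on samples drawn from the current training distribution and (ii) updating that distribution's parameters to reduce error averaged over a family of shifted downstream test tasks. Everything here is an illustrative assumption, not the authors' implementation: the target function `f`, the kernel ridge surrogate, the Gaussian training-distribution family, and the coordinate-search update are all stand-ins for the abstract's model class, sampling distribution, and optimization scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.5 * x  # assumed target function

def fit_krr(x, y, lam=1e-3, gamma=10.0):
    """Kernel ridge regression with a Gaussian kernel; returns a predictor."""
    K = np.exp(-gamma * (x[:, None] - x[None, :]) ** 2)
    alpha = np.linalg.solve(K + lam * np.eye(len(x)), y)
    return lambda z: np.exp(-gamma * (z[:, None] - x[None, :]) ** 2) @ alpha

def ood_risk(model, shifts=(-1.0, 0.0, 1.0), n=200):
    """Mean squared error averaged over a family of shifted test tasks."""
    errs = []
    for s in shifts:
        z = rng.normal(loc=s, scale=0.5, size=n)
        errs.append(np.mean((model(z) - f(z)) ** 2))
    return float(np.mean(errs))

# Alternating optimization over a Gaussian training distribution N(mu, sigma^2).
mu, sigma = 0.0, 0.3
for _ in range(5):
    # (i) fit the surrogate on samples from the current training distribution
    x = rng.normal(mu, sigma, size=100)
    model = fit_krr(x, f(x))
    # (ii) update (mu, sigma) by a crude coordinate search on the OOD risk
    best = (ood_risk(model), mu, sigma)
    for m in np.linspace(-1.5, 1.5, 7):
        for s in (0.3, 0.6, 1.0):
            xs = rng.normal(m, s, size=100)
            r = ood_risk(fit_krr(xs, f(xs)))
            if r < best[0]:
                best = (r, m, s)
    _, mu, sigma = best

print(f"selected training distribution: N({mu:.2f}, {sigma:.2f}^2)")
```

The point of the sketch is the structure of the loop, not the search method: in practice one would replace the grid search with the bilevel or gradient-based updates the abstract refers to, and the family of shifts with the actual downstream task distribution.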


© Société mathématique du Canada : http://www.smc.math.ca/