Upcoming seminars
December 19 at 10:00 | Richard Kraaij | B211 | Seminar |
January 9 at 10:00 | Raphaël Barboni | TBD | Seminar |
January 16 at 10:30 | Pierre-Cyril Aubin | B211 | Seminar |
January 30 at 10:30 | Guillaume Chennetier | TBD | Seminar |
February 17/19/20 | Emma Horton | TBD | Seminar |
March 6 at 10:00 | Eloi Tanguy | TBD | Seminar |
-
Richard Kraaij (TU Delft), Thursday December 19th, 10:00am, Room B211.
Well-posedness for Hamilton-Jacobi equations for stochastic control problems: A new view on the classical approach using couplings
Stochastic control problems for controlled Markov processes can be infinitesimally characterized using a second order Hamilton-Jacobi-Bellman (HJB) equation. The classical work of Crandall-Ishii-Lions (1992) establishes how to obtain uniqueness of viscosity solutions for controlled diffusion processes, and a collection of recent works has pushed the estimates to include classes of spatially inhomogeneous controlled Lévy processes.
We introduce a new perspective on the classical proof methods in terms of Markovian couplings, putting both sets of results mentioned above in a common framework. The new perspective also enables an approach in new contexts, e.g. in that of Riemannian manifolds, which, if time allows, I will briefly discuss.
Based on joint work with Serena Della Corte (Delft), Fabian Fuchs (Bielefeld) and Max Nendel (Bielefeld), and work in progress with Karen Habermann (Warwick) and Rik Versendaal (Delft).
-
Raphaël Barboni (ENS Ulm), Thursday January 9th, 10:00am, Room TBD.
Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
We study the convergence of gradient flow for the training of deep neural networks. While Residual Neural Networks are a popular example of very deep architectures, their training constitutes a challenging optimization problem, notably due to the non-convexity and non-coercivity of the objective. Yet, in applications, those tasks are successfully solved by simple optimization algorithms such as gradient descent. To better understand this phenomenon, we focus here on a “mean-field” model of infinitely deep and arbitrarily wide ResNet, parameterized by probability measures over the product set of layers and parameters and with constant marginal on the set of layers. Indeed, in the case of shallow neural networks, mean-field models have proven to benefit from simplified loss landscapes and good theoretical guarantees when trained with gradient flow for the Wasserstein metric on the set of probability measures. Motivated by this approach, we propose to train our model with gradient flow w.r.t. the conditional Optimal Transport distance: a restriction of the classical Wasserstein distance which enforces our marginal condition. We first show the well-posedness of the gradient flow equation and then its local convergence around well-chosen initializations. This is joint work with G. Peyré and F.-X. Vialard.
-
Pierre-Cyril Aubin (CERMICS), Thursday January 16th, 10:30am, Room B211.
EVIs and gradient descent with c(x,y) cost, and alternating minimization
How can we go beyond the squared distance d^2 in optimization algorithms and flows in metric spaces? Replacing it with a general cost function c(x,y) and using a majorize-minimize framework, I will detail a generic class of algorithms encompassing Newton/mirror/natural/Riemannian gradient descent by reframing them as an alternating minimization, each for a different cost c(x,y). Rooted in cross-differences, the convergence theory to the infimum and to the continuous flow is based on a (discrete) evolution variational inequality (EVI) which enjoys similar properties to the EVI with d^2 regularizer. This provides a theoretical framework for studying splitting schemes beyond the usual implicit Euler in gradient flows. This talk is based on works with Flavien Léger (INRIA Paris), Giacomo Sodini and Ulisse Stefanelli (Uni Vienna).
-
Guillaume Chennetier (CERMICS), Thursday January 30th, 10:30am, Room TBD.
TBD
TBD
-
Emma Horton (University of Warwick), February 17th, 18th or 20th at 10:00, Room TBD.
Monte Carlo methods for branching processes
Branching processes naturally arise as pertinent models in a variety of situations such as cell division, population dynamics and nuclear fission. For a wide class of branching processes, it is common that their first moment exhibits a Perron-Frobenius-type decomposition. That is, the first-order asymptotic behaviour is described by a triple $(\lambda, \varphi, \eta)$, where $\lambda$ is the leading eigenvalue of the system and $\varphi$ and $\eta$ are the corresponding right eigenfunction and left eigenmeasure respectively. Thus, obtaining good estimates of these quantities is imperative for understanding the long-time behaviour of these processes. In this talk, we discuss various Monte Carlo methods for estimating this triple. This talk is based on joint work with Alex Cox (University of Bath) and Denis Villemonais (Université de Lorraine).
-
Eloi Tanguy (Université Paris-Cité), Thursday March 6th, 10:00am, Room TBD.
TBD
TBD
Past seminars (2024-2025)
-
Borjan Geshkovski (Inria MEGAVOLT), October 16th, 15:00, Room B211.
Dynamic metastability in the self-attention model
The pure self-attention model is a simplification of the celebrated Transformer architecture, which neglects multi-layer perceptron layers and includes only a single inverse temperature parameter. The model exhibits a remarkably similar qualitative behavior across layers to that observed empirically in a pre-trained Transformer. Viewing layers as a time variable, the self-attention model can be interpreted as an interacting particle system on the unit sphere. We show that when the temperature is sufficiently high, all particles collapse into a single cluster exponentially fast. On the other hand, when the temperature falls below a certain threshold, we show that although the particles eventually collapse into a single cluster, the required time is at least exponentially long. This is a manifestation of dynamic metastability: particles remain trapped in a “slow manifold” consisting of several clusters for exponentially long periods of time. Our proofs make use of the fact that the self-attention model can be written as the gradient flow of a specific interaction energy functional previously found in combinatorics.
-
Olivier Zahm (Inria AIRSEA), October 28th, 10:00am, Room B211.
Preconditioning Langevin dynamics via optimal Riemannian Poincaré inequalities
The Poincaré inequality is a key property for the convergence analysis of many practical algorithms, including MCMC samplers and dimension reduction methods. In this talk, we introduce a Riemannian version of the Poincaré inequality where a positive definite weighting matrix field (i.e. a Riemannian metric) is introduced to improve the Poincaré constant, and therefore the convergence speed of the resulting preconditioned Langevin dynamics. By leveraging the notion of *moment measure*, we prove the existence of an optimal metric which yields a Poincaré constant of 1. This optimal metric turns out to be a *Stein kernel*, offering a novel perspective on these complex but central mathematical objects that are hard to obtain in practice. We also present an implementable optimization algorithm to numerically obtain the optimal metric. The method’s effectiveness is illustrated through simple but non-trivial examples which reveal rather complex solutions. Lastly, we show how to design efficient Langevin-based sampling schemes which enable rapid jumps across the various modes and tails of the measure to be sampled from.
-
Theron Guo (MIT, visiting MATHERIALS in October and November), November 7th, 10:00am, Room B211.
Model order reduction for computational homogenization in nonlinear solid mechanics
Computational homogenization has become an indispensable method to establish the effective properties of microstructures and efficiently solve multiscale problems in solid mechanics. However, the resulting two-scale problem remains computationally expensive for nonlinear problems and is typically infeasible in multi-query contexts, such as optimization or uncertainty quantification. To alleviate the high computational costs, model order reduction techniques can be used. In this talk, I will introduce different variants of computational homogenization, and illustrate the effectiveness of projection-based model order reduction for two variants.
-
Maria Laura Delle Monache (UC Berkeley), Thursday November 28th, 10:00am, Room B211.
Control Strategies for Mixed Autonomy Traffic: theory, simulations and real-life experiments.
The recent and rapid emergence of disruptive technologies is dramatically changing how traffic is monitored and managed in our cities. They will help generate new knowledge and capabilities to design and implement innovative transport policies. In this talk, we will show how we can exploit new technologies to improve traffic management. We will focus on control strategies for traffic systems with the aid of small fleets of connected and automated vehicles immersed in human-driven traffic flow. We present a class of coupled PDE-ODE models describing the interaction of autonomous vehicles (AVs) with the surrounding traffic. The model consists of a scalar conservation law for the main traffic flow, coupled with ordinary differential equations describing the possibly interacting AV trajectories. We will show analytically and numerically how the proposed control theory can improve traffic performance and, finally, we will present the MegaVanderTest, a test involving 100 connected and automated vehicles (CAVs). The MegaVanderTest is, to our knowledge, the field test which achieved the largest concentration of CAVs collaboratively controlling traffic on a single stretch of freeway.
-
Andy Philpott (University of Auckland), Monday December 2nd, 14:00, Room B211.
10 Challenges for mathematical modeling of energy transition
The Architecture of Green Energy Systems program was a 10-week joint research project funded by the Institute of Mathematical and Statistical Innovation at the University of Chicago in the summer of 2024. An outcome of this project was a draft review paper based on discussions of participants at the closing workshop. This paper identifies ten challenges for the mathematical modelling community that will be crucial to address in planning and implementing the transition to a net-zero carbon energy system.
In this talk I will give a broad overview of the challenges that were identified with a focus on modelling challenges arising in the conversion of household, transport and industrial energy use to renewable electricity. This will require enormous investments in renewable electricity generation, storage and transmission. I will give some examples of models used to help plan this process, and discuss the challenges that still have to be addressed to make the models useful to policy makers, regulators and investors.
-
Immanuel Bomze (Univ. Vienna), Monday December 2nd, 15:30, Room B211.
First-order methods for the impatient – support identification in finite time with Frank/Wolfe-variants
We study active set identification results for the away-step Frank-Wolfe algorithm in different settings. We first prove a local identification property that we apply, in combination with a convergence hypothesis, to get an active set identification result. We then prove, in the nonconvex case, a novel $O(1/\sqrt{k})$ convergence rate result and active set identification for different step sizes (under suitable assumptions on the set of stationary points). By exploiting those results, we also give explicit active set complexity bounds for both strongly convex and nonconvex objectives. While we initially consider the probability simplex as the feasible set, time permitting we show how to adapt some of our results to generic polytopes. A particular case with interesting applications covers projection-free methods on product domains.
-
Feliks Nüske (Max Planck Institute), Tuesday December 17th, 10:00am, Room B211.
Approximating Metastable Dynamics with Random Fourier Features
Metastability is a phenomenon which often inhibits the efficient simulation of dynamical systems, or the generation of samples from high-dimensional probability measures. In particular, metastability is frequently encountered in computer simulations of biological macromolecules using molecular dynamics. It is well-known that metastable transitions and their time scales are encoded in the dominant spectrum of certain transition operators, also called Koopman operators. The study of Koopman operators, and their data-driven approximation by algorithms like the Extended Dynamic Mode Decomposition (EDMD), have gained significant traction in the study of dynamical systems, and have led to widespread application.
In this talk, I will report on recent progress concerning the data-driven analysis of metastable systems using Koopman operators. First, I will introduce approximation methods on reproducing kernel Hilbert spaces (RKHS), which allow the use of rich approximation spaces, and explain how the resulting large-scale linear problems can be solved efficiently using random Fourier features (RFF). Second, I will explain how similar ideas can be applied to learn models for the infinitesimal generator, which allows for a more detailed system analysis, including the definition of coarse-grained models.
-
Richard Kraaij (TU Delft), Thursday December 19th, 10:00am, Room B211.
Well-posedness for Hamilton-Jacobi equations for stochastic control problems: A new view on the classical approach using couplings
Stochastic control problems for controlled Markov processes can be infinitesimally characterized using a second order Hamilton-Jacobi-Bellman (HJB) equation. The classical work of Crandall-Ishii-Lions (1992) establishes how to obtain uniqueness of viscosity solutions for controlled diffusion processes, and a collection of recent works has pushed the estimates to include classes of spatially inhomogeneous controlled Lévy processes.
We introduce a new perspective on the classical proof methods in terms of Markovian couplings, putting both sets of results mentioned above in a common framework. The new perspective also enables an approach in new contexts, e.g. in that of Riemannian manifolds, which, if time allows, I will briefly discuss.
Based on joint work with Serena Della Corte (Delft), Fabian Fuchs (Bielefeld) and Max Nendel (Bielefeld), and work in progress with Karen Habermann (Warwick) and Rik Versendaal (Delft).