Upcoming seminars
October 16 at 15:00 | Borjan Geshkovski | B211 |
October 28 at 10:00 | Olivier Zahm | B211 |
November 7 at 10:00 | Theron Guo | B211 |
November 28 at 10:00 | Maria Laura Delle Monache | B211 |
-
Borjan Geshkovski (Inria MEGAVOLT), October 16th, 3:00pm, Room B211.
Dynamic metastability in the self-attention model
The pure self-attention model is a simplification of the celebrated Transformer architecture, which neglects multi-layer perceptron layers and includes only a single inverse temperature parameter. Across layers, the model exhibits qualitative behavior remarkably similar to that observed empirically in a pre-trained Transformer. Viewing layers as a time variable, the self-attention model can be interpreted as an interacting particle system on the unit sphere. We show that when the temperature is sufficiently high, all particles collapse into a single cluster exponentially fast. On the other hand, when the temperature falls below a certain threshold, we show that although the particles eventually collapse into a single cluster, the required time is at least exponentially long. This is a manifestation of dynamic metastability: particles remain trapped in a “slow manifold” consisting of several clusters for exponentially long periods of time. Our proofs make use of the fact that the self-attention model can be written as the gradient flow of a specific interaction energy functional previously found in combinatorics.
-
Olivier Zahm (Inria AIRSEA), October 28th, 10:00am, Room B211.
Preconditioning Langevin dynamics via optimal Riemannian Poincaré inequalities
The Poincaré inequality is a key property for the convergence analysis of many practical algorithms, including MCMC samplers and dimension reduction methods. In this talk, we introduce a Riemannian version of the Poincaré inequality where a positive definite weighting matrix field (i.e. a Riemannian metric) is introduced to improve the Poincaré constant, and therefore the convergence speed of the resulting preconditioned Langevin dynamics. By leveraging the notion of *moment measure*, we prove the existence of an optimal metric which yields a Poincaré constant of 1. This optimal metric turns out to be a *Stein kernel*, offering a novel perspective on these complex but central mathematical objects that are hard to obtain in practice. We also present an implementable optimization algorithm to numerically obtain the optimal metric. The method’s effectiveness is illustrated through simple but non-trivial examples which reveal rather complex solutions. Lastly, we show how to design efficient Langevin-based sampling schemes which enable rapid jumps across the various modes and tails of the measure to be sampled from.
-
Theron Guo (MIT, visiting MATHERIALS in October and November), November 7th, 10:00am, Room B211.
Model order reduction for computational homogenization in nonlinear solid mechanics
Computational homogenization has become an indispensable method to establish the effective properties of microstructures and efficiently solve multiscale problems in solid mechanics. However, the resulting two-scale problem remains computationally expensive for nonlinear problems and is typically infeasible in multi-query contexts, such as optimization or uncertainty quantification. To alleviate the high computational costs, model order reduction techniques can be used. In this talk, I will introduce different variants of computational homogenization, and illustrate the effectiveness of projection-based model order reduction for two variants.
-
Maria Laura Delle Monache (UC Berkeley), November 28th, 10:00am, Room B211.
TBD
TBD