New preprint

Raphaël Barboni · July 23, 2025

We published a new preprint on “Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime”.

Considering a Variable Projection [1] or two-timescale learning [2] strategy, we show that during training the distribution of the inner weights of a two-layer neural network evolves according to an ultra-fast diffusion equation [3].
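The two-timescale (or Variable Projection) strategy can be illustrated on a toy problem: since a two-layer network is linear in its outer weights, those can be solved exactly by least squares at every iteration, while only the inner weights take a gradient step. The sketch below is a minimal, hypothetical illustration of this idea (the data, network size, and step size are illustrative choices, not taken from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (illustrative, not from the paper)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0])

m = 50                       # number of hidden neurons
W = rng.normal(size=(m, 1))  # inner weights (slow variable)
b = rng.normal(size=m)       # inner biases (slow variable)

def features(X, W, b):
    """Hidden-layer activations of a two-layer tanh network."""
    return np.tanh(X @ W.T + b)

lr = 0.1
for step in range(500):
    Phi = features(X, W, b)                 # (n, m) feature matrix
    # Fast variable: outer (linear) weights solved exactly by
    # least squares at every iteration (Variable Projection).
    a, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    # Slow variable: inner weights take a single gradient step
    # on the projected least-squares loss.
    r = Phi @ a - y                         # residual, (n,)
    dPhi = (1.0 - Phi**2) * a               # chain rule through tanh
    gW = (dPhi * r[:, None]).T @ X / len(y)
    gb = (dPhi * r[:, None]).mean(axis=0)
    W -= lr * gW
    b -= lr * gb

mse = np.mean((features(X, W, b) @ a - y) ** 2)
```

Because the outer weights are kept at their optimum throughout, the inner weights effectively descend the reduced (projected) objective, which is the regime studied in the preprint.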

References

[1] G. H. Golub, V. Pereyra. The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM Journal on Numerical Analysis (1973).

[2] P. Marion, R. Berthier. Leveraging the two-timescale regime to demonstrate convergence of neural networks. Advances in Neural Information Processing Systems (2023).

[3] M. Iacobelli, F. S. Patacchini, F. Santambrogio. Weighted ultrafast diffusion equations: from well-posedness to long-time behaviour. Archive for Rational Mechanics and Analysis (2019).
