Deep Learning Seminar – Dmitry Yarotsky – Universal Formulas and Neural Networks
Date: October 18, 2024
Time: 11:00 am - 12:00 pm
Area: Approximation Theory of Deep Neural Networks
Location: Zoom
https://mcmaster.zoom.us/j/91523393302
Meeting ID: 915 2339 3302
Speaker: Dmitry Yarotsky
Title: Universal Formulas and Neural Networks
Abstract: Universal formulas are parameterized analytical expressions of fixed complexity that can approximate any continuous function. The Universal Approximation Theorem states that expressions such as single-layer neural networks can approximate any continuous function if the number of neurons is increased indefinitely. However, it turns out that with a certain choice of activation functions and network architecture, one can achieve this with a fixed number of neurons. This raises two natural follow-up questions: 1) What structural conditions enable this possibility? and 2) Can the parameters of such a network be learned through gradient descent (GD)? Regarding the first question, it can be inferred from the theory of Pfaffian functions that the neural network must include the sine function with an unbounded argument. From the theory of algebraically transcendental functions and Van der Waerden's theorem, it follows that single-layer networks with classical activations satisfy algebraic equations and therefore cannot be finitely universal. Regarding the second question, if the dimensionality of the target function space exceeds the number of parameters W, gradient descent cannot learn arbitrary target functions. In particular, the Borsuk-Ulam theorem implies that any set of target functions homeomorphic to a W-sphere contains unlearnable targets. On the other hand, if a probability measure is defined on the space of targets, a model can be constructed that GD-learns a fat Cantor set of targets with arbitrarily high probability.
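
For reference, a sketch of the contrast drawn in the abstract, in illustrative notation of our own (f_N, sigma, K, and the parameter names are not from the talk): the classical Universal Approximation Theorem concerns single-hidden-layer networks

    f_N(x) = \sum_{i=1}^{N} c_i \, \sigma(a_i^\top x + b_i),

which become dense in C(K), for compact K \subset \mathbb{R}^d, as the width N grows, for any continuous non-polynomial activation \sigma. A universal formula in the sense of the abstract instead keeps N and the architecture fixed and lets only the parameter values (c_i, a_i, b_i) vary; as stated above, this is possible only if the expression contains the sine function applied to an argument that is allowed to grow without bound.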