Depth Separations in Neural Networks: What is Actually Being Separated?

Research output: Contribution to journal › Conference article › peer-review

Abstract

Existing depth separation results for constant-depth networks essentially show that certain radial functions in R^d, which can be easily approximated with depth 3 networks, cannot be approximated by depth 2 networks, even up to constant accuracy, unless their size is exponential in d. However, the functions used to demonstrate this are rapidly oscillating, with a Lipschitz parameter scaling polynomially with the dimension d (or equivalently, by scaling the function, the hardness result applies to O(1)-Lipschitz functions only when the target accuracy ε is at most poly(1/d)). In this paper, we study whether such depth separations might still hold in the natural setting of O(1)-Lipschitz radial functions, when ε does not scale with d. Perhaps surprisingly, we show that the answer is negative: In contrast to the intuition suggested by previous work, it is possible to approximate O(1)-Lipschitz radial functions with depth 2, size poly(d) networks, for every constant ε. We complement this by showing that approximating such functions is also possible with depth 2, size poly(1/ε) networks, for every constant d. Finally, we show that it is not possible to have polynomial dependence on both d and 1/ε simultaneously. Overall, our results indicate that in order to show depth separations for expressing O(1)-Lipschitz functions with constant accuracy – if at all possible – one would need fundamentally different techniques than existing ones in the literature.
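The positive result discussed above concerns depth 2 (one-hidden-layer) networks approximating O(1)-Lipschitz radial functions such as x ↦ ||x||. A minimal sketch of why such approximation is plausible, using random ReLU features (this specific construction and all names in it are illustrative assumptions, not the paper's construction): for w drawn uniformly from the unit sphere, E[relu(w·x)] = c_d · ||x||, so averaging many random ReLU units yields a depth 2 network approximating the Euclidean norm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_hidden = 20, 5000  # illustrative dimension and hidden-layer width

# Hidden-layer weights: random unit directions (uniform on the sphere).
W = rng.standard_normal((n_hidden, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# For w uniform on the sphere, E[relu(w . x)] = c_d * ||x||, where
# c_d = E[max(w_1, 0)]; we estimate c_d by Monte Carlo from the same draws.
c_d = np.maximum(W[:, 0], 0.0).mean()

def depth2_norm(x):
    """One-hidden-layer ReLU network approximating ||x|| (illustrative)."""
    return np.maximum(W @ x, 0.0).mean() / c_d

x = rng.standard_normal(d)
approx = depth2_norm(x)
true = np.linalg.norm(x)
print(approx, true)  # the two values should be close for large n_hidden
```

This only illustrates the flavor of a depth 2, polynomial-size approximation of one particular O(1)-Lipschitz radial function; the paper's actual constructions and the accompanying lower bound (no poly(d, 1/ε) dependence simultaneously) are of course more involved.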

Original language: American English
Pages (from-to): 2664-2666
Number of pages: 3
Journal: Proceedings of Machine Learning Research
Volume: 99
State: Published - 1 Jan 2019
Event: 32nd Conference on Learning Theory, COLT 2019 - Phoenix, United States
Duration: 25 Jun 2019 – 28 Jun 2019
https://proceedings.mlr.press/v99

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
