
hillbig: Theoretical analysis of deep learning by Dr. Taiji Suzuki, especially on representation capability, generalization capability, and optimization theory. He covers a wide range of important topics, including the latest Neural Tangent Kernel and double descent. I don't think there is anything as comprehensive as this in English.

I get an error when I access the original SlideShare, but I can see it on X/Twitter. A cache, perhaps?

As an easy-to-understand concrete example: for a function whose value is determined by the distance from the origin, four layers suffice with polynomially many units in the number of dimensions (frankly, I think it's linear).
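A rough sketch of the reasoning as I read it (the decomposition below is my own illustration, not taken verbatim from the slides): a radial target factors through the one-dimensional radius, so the input dimension enters only linearly.

```latex
% Sketch (my illustration): a radial target factors through the radius,
%   f(x) = g(\|x\|^2), \qquad \|x\|^2 = \sum_{i=1}^{d} x_i^2 .
% Each univariate square t \mapsto t^2 is approximated on a bounded interval
% by a fixed number of ReLU units (a count that depends on the accuracy but
% not on d), so the early layers compute the squared radius with O(d) units,
% and the remaining layers only approximate the 1-D profile g, independent of d.
\[
  f(x) \;=\; g\!\left(\|x\|^2\right),
  \qquad
  \|x\|^2 \;=\; \sum_{i=1}^{d} x_i^2 .
\]
```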

Generalization error bound

  • skip this spot

Approximation performance by function class



  • kernel ridge regression (the basis is fixed in advance; see the sketch after this list)
  • adaptive method
    • deep learning
    • sparse estimation
      • I guess that if you have to prepare too many things in advance, it becomes impractical.
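As a hedged illustration of this fixed-basis vs. adaptive contrast (the toy data, the sine dictionary, and all parameter values below are my own assumptions, not from the lecture): kernel ridge regression commits to a basis through its kernel before seeing the data, while sparse estimation expands into a large dictionary and lets the L1 penalty adaptively pick out a few terms.

```python
# Hedged sketch of the fixed-basis vs. adaptive contrast. The toy data, the
# sine dictionary, and every parameter value here are illustrative assumptions.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(200)

# Non-adaptive: the RBF kernel fixes the (implicit) basis before seeing data.
krr = KernelRidge(kernel="rbf", gamma=5.0, alpha=1e-2).fit(X, y)

# Adaptive: expand into many candidate basis functions and let the L1
# penalty select a few, so the effective basis adapts to the target.
freqs = np.arange(1, 51)          # hypothetical dictionary of 50 sine features
Phi = np.sin(X * freqs)           # shape (200, 50)
lasso = Lasso(alpha=1e-2).fit(Phi, y)
print("selected features:", np.flatnonzero(lasso.coef_))
```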

Besov space

The various function classes mentioned in past discussions are special cases of [Besov space].
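For context, two familiar examples of this (standard identifications for Besov spaces B^s_{p,q}, well-known facts rather than something specific to the slides):

```latex
% Standard identifications (well-known facts about Besov spaces B^s_{p,q},
% not taken from the slides); here s > 0 denotes the smoothness.
\begin{align*}
  B^{s}_{2,2} &= H^{s}
    && \text{(Sobolev--Hilbert spaces)} \\
  B^{s}_{\infty,\infty} &= C^{s}
    && \text{(H\"older spaces, for non-integer } s\text{)}
\end{align*}
```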

Sparsity


Neural Tangent Kernel / Mean Field
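A minimal sketch of the NTK idea (the tiny MLP and its width below are my own illustrative choices, not from the slides): the empirical NTK pairs the parameter gradients of the network output at two inputs; in the mean-field regime, by contrast, the parameters move far enough during training that this kernel does not stay fixed.

```python
# Minimal sketch of the empirical Neural Tangent Kernel,
#   K(x, x') = <grad_theta f(x; theta), grad_theta f(x'; theta)>.
# The tiny MLP and its width are illustrative choices.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
params = tuple(net.parameters())

def param_grad(x: torch.Tensor) -> torch.Tensor:
    """Flattened gradient of the scalar network output w.r.t. all parameters."""
    out = net(x).squeeze()
    grads = torch.autograd.grad(out, params)
    return torch.cat([g.reshape(-1) for g in grads])

x1 = torch.tensor([1.0, 0.0])
x2 = torch.tensor([0.0, 1.0])
print("empirical NTK entry:", torch.dot(param_grad(x1), param_grad(x2)).item())
```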


This page is auto-translated from /nishio/鈴木大慈-深層学習の数理 using DeepL. If you see something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.